03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: grep Port /etc/ssh/sshd_config
03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): grep
03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): Port
03-16 04:37:09.696 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): /etc/ssh/sshd_config
#Port 22
#GatewayPorts no
03-16 04:37:09.700 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 04:37:09.700 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): nvidia-smi
Wed Mar 16 04:37:09 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.82.01    Driver Version: 470.82.01    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000001:00:00.0 Off |                    0 |
| N/A   38C    P0    40W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000002:00:00.0 Off |                    0 |
| N/A   41C    P0    44W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000003:00:00.0 Off |                    0 |
| N/A   39C    P0    42W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000004:00:00.0 Off |                    0 |
| N/A   41C    P0    41W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla V100-SXM2...  On   | 00000005:00:00.0 Off |                    0 |
| N/A   37C    P0    41W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla V100-SXM2...  On   | 00000006:00:00.0 Off |                    0 |
| N/A   40C    P0    42W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla V100-SXM2...  On   | 00000007:00:00.0 Off |                    0 |
| N/A   40C    P0    44W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla V100-SXM2...  On   | 00000008:00:00.0 Off |                    0 |
| N/A   41C    P0    41W / 300W |      0MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
03-16 04:37:14.822 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: ifconfig
03-16 04:37:14.822 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:5e:07:9c:92  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.0.8  netmask 255.255.224.0  broadcast 10.0.31.255
        inet6 fe80::222:48ff:fe77:1ca5  prefixlen 64  scopeid 0x20<link>
        ether 00:22:48:77:1c:a5  txqueuelen 1000  (Ethernet)
        RX packets 167468324  bytes 88391604986 (88.3 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 113298509  bytes 151614183851 (151.6 GB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 172540795  bytes 9259304630 (9.2 GB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 172540795  bytes 9259304630 (9.2 GB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

03-16 04:37:14.826 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: df -h
03-16 04:37:14.826 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): df
03-16 04:37:14.826 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): -h
03-16 04:37:14.830 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: ls /dev
03-16 04:37:14.831 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ls
03-16 04:37:14.831 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): /dev
autofs bsg btrfs-control console core cpu_dma_latency cuse dri ecryptfs fb0 fd
full fuse gdrdrv hpet hwrng infiniband input isst_interface kmsg knem loop0
loop1 loop2 loop3 loop4 loop5 loop6 loop7 loop-control mapper mcelog mem mqueue
net null nvidia0 nvidia1 nvidia2 nvidia3 nvidia4 nvidia5 nvidia6 nvidia7
nvidia-caps nvidiactl nvidia-modeset nvidia-nvswitchctl nvidia-uvm
nvidia-uvm-tools nvram port ppp psaux ptmx ptp0 ptp1 pts random rfkill rtc0
sda sda1 sda14 sda15 sdb sdb1 sg0 sg1 shm snapshot stderr stdin stdout tty
tty0 tty1 tty10 tty11 tty12 tty13 tty14 tty15 tty16 tty17 tty18 tty19 tty2
tty20 tty21 tty22 tty23 tty24 tty25 tty26 tty27 tty28 tty29 tty3 tty30 tty31
tty32 tty33 tty34 tty35 tty36 tty37 tty38 tty39 tty4 tty40 tty41 tty42 tty43
tty44 tty45 tty46 tty47 tty48 tty49 tty5 tty50 tty51 tty52 tty53 tty54 tty55
tty56 tty57 tty58 tty59 tty6 tty60 tty61 tty62 tty63 tty7 tty8 tty9 ttyprintk
ttyS0 ttyS1 ttyS10 ttyS11 ttyS12 ttyS13 ttyS14 ttyS15 ttyS16 ttyS17 ttyS18
ttyS19 ttyS2 ttyS20 ttyS21 ttyS22 ttyS23 ttyS24 ttyS25 ttyS26 ttyS27 ttyS28
ttyS29 ttyS3 ttyS30 ttyS31 ttyS4 ttyS5 ttyS6 ttyS7 ttyS8 ttyS9 udmabuf uhid
uinput urandom userio vcs vcs1 vcs2 vcs3 vcs4 vcs5 vcs6 vcsa
vcsa1 vcsa2 vcsa3 vcsa4 vcsa5 vcsa6 vcsu vcsu1 vcsu2 vcsu3 vcsu4 vcsu5 vcsu6
vfio vga_arbiter vhost-net vhost-vsock vmbus zero zfs
+ ulimit -n 262144
+ cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#
#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4

# End of file
+ ulimit -n 999999
+ ulimit -Hn 999999
+ ulimit -Sn 999999
+ ulimit -n 999999
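(The launch script raises the open-file limit from the shell before installing dependencies. For reference, the same adjustment can be made from inside Python with the standard resource module; this is a minimal illustrative sketch, not part of the job script.)

import resource

# Mirror of the `ulimit -n 999999` calls above. Raising the hard limit
# requires privileges; this container runs as root, so both succeed.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print('before:', soft, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (999999, 999999))
print('after:', resource.getrlimit(resource.RLIMIT_NOFILE))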
+ pip install -r requirements.txt
Collecting git+https://github.com/rwightman/pytorch-image-models.git (from -r requirements.txt (line 49))
  Cloning https://github.com/rwightman/pytorch-image-models.git to /tmp/pip-req-build-8enxxbsc
  Running command git clone -q https://github.com/rwightman/pytorch-image-models.git /tmp/pip-req-build-8enxxbsc
Requirement already satisfied: Deprecated in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 1)) (1.2.13)
Requirement already satisfied: pymongo in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 2)) (4.0.2)
Requirement already satisfied: azure in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 3)) (2.0.0)
Requirement already satisfied: azure-storage-blob==2.1.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 4)) (2.1.0)
Requirement already satisfied: Cython in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 5)) (0.29.23)
Requirement already satisfied: django in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 6)) (3.2.12)
Requirement already satisfied: easydict in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (1.9)
Requirement already satisfied: ete3 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 8)) (3.1.2)
Requirement already satisfied: future in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 9)) (0.18.2)
Requirement already satisfied: ipython in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 10)) (7.32.0)
Requirement already satisfied: jinja2 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 11)) (3.0.3)
Requirement already satisfied: scikit-image in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 12)) (0.19.2)
Requirement already satisfied: matplotlib in /opt/conda/lib/python3.7/site-packages/matplotlib-3.4.2-py3.7-linux-x86_64.egg (from -r requirements.txt (line 13)) (3.4.2)
Requirement already satisfied: nltk in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 14)) (3.7)
Requirement already satisfied: opencv-python in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 15)) (4.5.5.64)
Requirement already satisfied: orderedset in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 16)) (2.0.3)
Requirement already satisfied: pathos in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 17)) (0.2.8)
Requirement already satisfied: pillow in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 18)) (8.1.0)
Requirement already satisfied: progressbar in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 19)) (2.5)
Requirement already satisfied: protobuf in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 20)) (3.19.4)
Requirement already satisfied: psutil in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 21)) (5.9.0)
Requirement already satisfied: python-magic in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 23)) (0.4.25)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 24)) (5.4.1)
Requirement already satisfied: simplejson in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 25)) (3.17.6)
Requirement already satisfied: traceback2 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 26)) (1.4.0)
Requirement already satisfied: tb-nightly in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 27)) (2.9.0a20220313)
Requirement already satisfied: yacs in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 28)) (0.1.8)
Requirement already satisfied: sklearn in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 29)) (0.0)
Requirement already satisfied: torchlars in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 30)) (0.1.2)
Requirement already satisfied: boto3 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 31)) (1.21.20)
Requirement already satisfied: anytree in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 32)) (2.8.0)
Requirement already satisfied: Ninja in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 33)) (1.10.0.post2)
Requirement already satisfied: pytorch_lamb in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 34)) (1.0.0)
Requirement already satisfied: timm in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 35)) (0.5.5)
Requirement already satisfied: dataclasses in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 36)) (0.6)
Requirement already satisfied: pytorch_lightning==1.1.4 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 37)) (1.1.4)
Requirement already satisfied: transformers==4.2.1 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 38)) (4.2.1)
Requirement already satisfied: torchvision==0.8.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 39)) (0.8.0)
Requirement already satisfied: torch==1.7.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 40)) (1.7.0)
Requirement already satisfied: tqdm==4.56.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 42)) (4.56.0)
Requirement already satisfied: ipdb==0.13.4 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 43)) (0.13.4)
Requirement already satisfied: numpy==1.20.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 44)) (1.20.0)
Requirement already satisfied: einops==0.3.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 45)) (0.3.0)
Requirement already satisfied: pyarrow==2.0.0 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 46)) (2.0.0)
Requirement already satisfied: sacred==0.8.2 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 47)) (0.8.2)
Requirement already satisfied: pandas==1.1.5 in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 48)) (1.1.5)
Requirement already satisfied: numba in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 50)) (0.55.1)
Requirement already satisfied: kmeans_pytorch in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 51)) (0.3)
Requirement already satisfied: pycocotools==2.0.0 in /opt/conda/lib/python3.7/site-packages/pycocotools-2.0-py3.7-linux-x86_64.egg (from -r requirements.txt (line 52)) (2.0)
Requirement already satisfied: fairscale in /opt/conda/lib/python3.7/site-packages (from -r requirements.txt (line 53)) (0.4.2)
Requirement already satisfied: azure-common>=1.1.5 in /opt/conda/lib/python3.7/site-packages (from azure-storage-blob==2.1.0->-r requirements.txt (line 4)) (1.1.28)
Requirement already satisfied: azure-storage-common~=2.1 in /opt/conda/lib/python3.7/site-packages (from azure-storage-blob==2.1.0->-r requirements.txt (line 4)) (2.1.0)
Requirement already satisfied: fsspec[http]>=0.8.1 in /opt/conda/lib/python3.7/site-packages (from pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2022.2.0)
Requirement already satisfied: tensorboard>=2.2.0 in /opt/conda/lib/python3.7/site-packages (from pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.8.0)
Requirement already satisfied: sacremoses in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (0.0.49)
Requirement already satisfied: requests in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (2.24.0)
Requirement already satisfied: importlib-metadata in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (4.11.3)
Requirement already satisfied: regex!=2019.12.17 in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (2022.3.15)
Requirement already satisfied: packaging in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (21.3)
Requirement already satisfied: filelock in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (3.6.0)
Requirement already satisfied: tokenizers==0.9.4 in /opt/conda/lib/python3.7/site-packages (from transformers==4.2.1->-r requirements.txt (line 38)) (0.9.4)
Requirement already satisfied: typing-extensions in /opt/conda/lib/python3.7/site-packages (from torch==1.7.0->-r requirements.txt (line 40)) (3.7.4.3)
Requirement already satisfied: setuptools in /opt/conda/lib/python3.7/site-packages (from ipdb==0.13.4->-r requirements.txt (line 43)) (52.0.0.post20210125)
Requirement already satisfied: munch<3.0,>=2.0.2 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (2.5.0)
Requirement already satisfied: py-cpuinfo>=4.0 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (8.0.0)
Requirement already satisfied: wrapt<2.0,>=1.0 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (1.14.0)
Requirement already satisfied: GitPython in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (3.1.27)
Requirement already satisfied: jsonpickle<2.0,>=1.2 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (1.5.2)
Requirement already satisfied: docopt<1.0,>=0.3 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (0.6.2)
Requirement already satisfied: colorama>=0.4 in /opt/conda/lib/python3.7/site-packages (from sacred==0.8.2->-r requirements.txt (line 47)) (0.4.4)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.7/site-packages (from pandas==1.1.5->-r requirements.txt (line 48)) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /opt/conda/lib/python3.7/site-packages/python_dateutil-2.8.1-py3.7.egg (from pandas==1.1.5->-r requirements.txt (line 48)) (2.8.1)
Requirement already satisfied: pickleshare in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.7.5)
Requirement already satisfied: matplotlib-inline in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.1.3)
Requirement already satisfied: jedi>=0.16 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.18.1)
Requirement already satisfied: pygments in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (2.11.2)
Requirement already satisfied: decorator in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (5.1.1)
Requirement already satisfied: pexpect>4.3 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (4.8.0)
Requirement already satisfied: backcall in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (0.2.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (3.0.28)
Requirement already satisfied: traitlets>=4.2 in /opt/conda/lib/python3.7/site-packages (from ipython->-r requirements.txt (line 10)) (5.1.1)
Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.7/site-packages/cycler-0.10.0-py3.7.egg (from matplotlib->-r requirements.txt (line 13)) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/lib/python3.7/site-packages/kiwisolver-1.3.1-py3.7-linux-x86_64.egg (from matplotlib->-r requirements.txt (line 13)) (1.3.1)
Requirement already satisfied: pyparsing>=2.2.1 in /opt/conda/lib/python3.7/site-packages/pyparsing-3.0.0b2-py3.7.egg (from matplotlib->-r requirements.txt (line 13)) (3.0.0b2)
Requirement already satisfied: cryptography in /opt/conda/lib/python3.7/site-packages (from azure-storage-common~=2.1->azure-storage-blob==2.1.0->-r requirements.txt (line 4)) (3.4.7)
Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from cycler>=0.10->matplotlib->-r requirements.txt (line 13)) (1.15.0)
Requirement already satisfied: aiohttp in /opt/conda/lib/python3.7/site-packages (from fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (3.8.1)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in /opt/conda/lib/python3.7/site-packages (from jedi>=0.16->ipython->-r requirements.txt (line 10)) (0.8.3)
Requirement already satisfied: ptyprocess>=0.5 in /opt/conda/lib/python3.7/site-packages (from pexpect>4.3->ipython->-r requirements.txt (line 10)) (0.7.0)
Requirement already satisfied: wcwidth in /opt/conda/lib/python3.7/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython->-r requirements.txt (line 10)) (0.2.5)
Requirement already satisfied: google-auth<3,>=1.6.3 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.6.0)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.8.1)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.4.6)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.6.1)
Requirement already satisfied: grpcio>=1.24.3 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.44.0)
Requirement already satisfied: werkzeug>=0.11.15 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.0.3)
Requirement already satisfied: wheel>=0.26 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.35.1)
Requirement already satisfied: markdown>=2.6.8 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (3.3.6)
Requirement already satisfied: absl-py>=0.4 in /opt/conda/lib/python3.7/site-packages (from tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.0.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /opt/conda/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.2.8)
Requirement already satisfied: rsa<5,>=3.1.4 in /opt/conda/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (4.8)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (5.0.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /opt/conda/lib/python3.7/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.3.1)
Requirement already satisfied: zipp>=0.5 in /opt/conda/lib/python3.7/site-packages (from importlib-metadata->transformers==4.2.1->-r requirements.txt (line 38)) (3.7.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /opt/conda/lib/python3.7/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.4.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (1.25.11)
Requirement already satisfied: idna<3,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests->transformers==4.2.1->-r requirements.txt (line 38)) (2021.5.30)
Requirement already satisfied: oauthlib>=3.0.0 in /opt/conda/lib/python3.7/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=2.2.0->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (3.2.0)
Requirement already satisfied: azure-servicemanagement-legacy~=0.20.6 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.20.7)
Requirement already satisfied: azure-keyvault~=0.3.3 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.3.7)
Requirement already satisfied: azure-storage~=0.34.2 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.34.3)
Requirement already satisfied: azure-datalake-store~=0.0.9 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.0.52)
Requirement already satisfied: azure-mgmt~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-graphrbac~=0.30.0 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.30.0)
Requirement already satisfied: azure-servicebus~=0.21.1 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (0.21.1)
Requirement already satisfied: azure-batch~=3.0.0 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (3.0.0)
Requirement already satisfied: azure-servicefabric~=5.6.130 in /opt/conda/lib/python3.7/site-packages (from azure->-r requirements.txt (line 3)) (5.6.130)
Requirement already satisfied: msrestazure~=0.4.7 in /opt/conda/lib/python3.7/site-packages (from azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.4.34)
Requirement already satisfied: azure-nspkg>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (3.0.2)
Requirement already satisfied: cffi in /opt/conda/lib/python3.7/site-packages (from azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (1.14.5)
Requirement already satisfied: adal>=0.4.2 in /opt/conda/lib/python3.7/site-packages (from azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (1.2.7)
Requirement already satisfied: PyJWT<3,>=1.0.0 in /opt/conda/lib/python3.7/site-packages (from adal>=0.4.2->azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (2.3.0)
Requirement already satisfied: azure-mgmt-compute~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-cognitiveservices~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-scheduler~=1.1.2 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.1.3)
Requirement already satisfied: azure-mgmt-documentdb~=0.1.3 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.3)
Requirement already satisfied: azure-mgmt-keyvault~=0.31.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.31.0)
Requirement already satisfied: azure-mgmt-web~=0.32.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.32.0)
Requirement already satisfied: azure-mgmt-batch~=4.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (4.0.0)
Requirement already satisfied: azure-mgmt-sql~=0.5.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.5.3)
Requirement already satisfied: azure-mgmt-dns~=1.0.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.1)
Requirement already satisfied: azure-mgmt-resource~=1.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.1.0)
Requirement already satisfied: azure-mgmt-iothub~=0.2.2 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.2.2)
Requirement already satisfied: azure-mgmt-network~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-datalake-analytics~=0.1.4 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.6)
Requirement already satisfied: azure-mgmt-trafficmanager~=0.30.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.30.0)
Requirement already satisfied: azure-mgmt-authorization~=0.30.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.30.0)
Requirement already satisfied: azure-mgmt-cdn~=0.30.3 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.30.3)
Requirement already satisfied: azure-mgmt-datalake-store~=0.1.4 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.6)
Requirement already satisfied: azure-mgmt-storage~=1.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (1.0.0)
Requirement already satisfied: azure-mgmt-redis~=4.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (4.1.1)
Requirement already satisfied: azure-mgmt-devtestlabs~=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (2.0.0)
Requirement already satisfied: azure-mgmt-monitor~=0.2.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.2.1)
Requirement already satisfied: azure-mgmt-containerregistry~=0.2.1 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.2.1)
Requirement already satisfied: azure-mgmt-rdbms~=0.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (0.1.0)
Requirement already satisfied: azure-mgmt-logic~=2.1.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (2.1.0)
Requirement already satisfied: azure-mgmt-nspkg>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt-authorization~=0.30.0->azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (3.0.2)
Requirement already satisfied: azure-mgmt-datalake-nspkg>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from azure-mgmt-datalake-analytics~=0.1.4->azure-mgmt~=1.0.0->azure->-r requirements.txt (line 3)) (3.0.1)
Requirement already satisfied: pycparser in /opt/conda/lib/python3.7/site-packages (from cffi->azure-datalake-store~=0.0.9->azure->-r requirements.txt (line 3)) (2.20)
Requirement already satisfied: msrest<2.0.0,>=0.4.28 in /opt/conda/lib/python3.7/site-packages (from msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.6.21)
Requirement already satisfied: keyring>=12.0.2 in /opt/conda/lib/python3.7/site-packages (from msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (23.5.0)
Requirement already satisfied: SecretStorage>=3.2 in /opt/conda/lib/python3.7/site-packages (from keyring>=12.0.2->msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (3.3.1)
Requirement already satisfied: jeepney>=0.4.2 in /opt/conda/lib/python3.7/site-packages (from keyring>=12.0.2->msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.7.1)
Requirement already satisfied: isodate>=0.6.0 in /opt/conda/lib/python3.7/site-packages (from msrest<2.0.0,>=0.4.28->msrestazure~=0.4.7->azure-batch~=3.0.0->azure->-r requirements.txt (line 3)) (0.6.1)
Requirement already satisfied: sqlparse>=0.2.2 in /opt/conda/lib/python3.7/site-packages (from django->-r requirements.txt (line 6)) (0.4.2)
Requirement already satisfied: asgiref<4,>=3.3.2 in /opt/conda/lib/python3.7/site-packages (from django->-r requirements.txt (line 6)) (3.5.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.7/site-packages (from jinja2->-r requirements.txt (line 11)) (2.1.1)
Requirement already satisfied: networkx>=2.2 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (2.6.3)
Requirement already satisfied: imageio>=2.4.1 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (2.9.0)
Requirement already satisfied: tifffile>=2019.7.26 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (2021.11.2)
Requirement already satisfied: scipy>=1.4.1 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (1.7.3)
Requirement already satisfied: PyWavelets>=1.1.1 in /opt/conda/lib/python3.7/site-packages (from scikit-image->-r requirements.txt (line 12)) (1.3.0)
Requirement already satisfied: click in /opt/conda/lib/python3.7/site-packages (from nltk->-r requirements.txt (line 14)) (8.0.4)
Requirement already satisfied: joblib in /opt/conda/lib/python3.7/site-packages (from nltk->-r requirements.txt (line 14)) (1.1.0)
Requirement already satisfied: ppft>=1.6.6.4 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (1.6.6.4)
Requirement already satisfied: pox>=0.3.0 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (0.3.0)
Requirement already satisfied: dill>=0.3.4 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (0.3.4)
Requirement already satisfied: multiprocess>=0.70.12 in /opt/conda/lib/python3.7/site-packages (from pathos->-r requirements.txt (line 17)) (0.70.12.2)
Requirement already satisfied: linecache2 in /opt/conda/lib/python3.7/site-packages (from traceback2->-r requirements.txt (line 26)) (1.0.0)
Requirement already satisfied: scikit-learn in /opt/conda/lib/python3.7/site-packages (from sklearn->-r requirements.txt (line 29)) (1.0.2)
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /opt/conda/lib/python3.7/site-packages (from boto3->-r requirements.txt (line 31)) (0.5.2)
Requirement already satisfied: botocore<1.25.0,>=1.24.20 in /opt/conda/lib/python3.7/site-packages (from boto3->-r requirements.txt (line 31)) (1.24.20)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /opt/conda/lib/python3.7/site-packages (from boto3->-r requirements.txt (line 31)) (0.10.0)
Requirement already satisfied: tensorboardX in /opt/conda/lib/python3.7/site-packages (from pytorch_lamb->-r requirements.txt (line 34)) (2.5)
Requirement already satisfied: llvmlite<0.39,>=0.38.0rc1 in /opt/conda/lib/python3.7/site-packages (from numba->-r requirements.txt (line 50)) (0.38.0)
Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (2.0.12)
Requirement already satisfied: frozenlist>=1.1.1 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.3.0)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.7.2)
Requirement already satisfied: asynctest==0.13.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (0.13.0)
Requirement already satisfied: aiosignal>=1.1.2 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (1.2.0)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (4.0.2)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (6.0.2)
Requirement already satisfied: attrs>=17.3.0 in /opt/conda/lib/python3.7/site-packages (from aiohttp->fsspec[http]>=0.8.1->pytorch_lightning==1.1.4->-r requirements.txt (line 37)) (21.4.0)
Requirement already satisfied: gitdb<5,>=4.0.1 in /opt/conda/lib/python3.7/site-packages (from GitPython->sacred==0.8.2->-r requirements.txt (line 47)) (4.0.9)
Requirement already satisfied: smmap<6,>=3.0.1 in /opt/conda/lib/python3.7/site-packages (from gitdb<5,>=4.0.1->GitPython->sacred==0.8.2->-r requirements.txt (line 47)) (5.0.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from scikit-learn->sklearn->-r requirements.txt (line 29)) (3.1.0)
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv
+ pip install deprecated
Requirement already satisfied: deprecated in /opt/conda/lib/python3.7/site-packages (1.2.13)
Requirement already satisfied: wrapt<2,>=1.10 in /opt/conda/lib/python3.7/site-packages (from deprecated) (1.14.0)
WARNING: Running pip as root will break packages and permissions. You should install packages reliably by using venv: https://pip.pypa.io/warnings/venv
+ export NCCL_DEBUG=INFO
+ python -c import nltk; nltk.download("punkt")
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
+ python -c import nltk; nltk.download("averaged_perceptron_tagger")
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
+ sleep 5
03-16 04:42:17.890 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: ls -llh
03-16 04:42:17.890 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ls
03-16 04:42:17.890 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): -llh
total 4.3M
-rw-rw-r-- 1 root root 4.5K Feb  4  2021 aml_job_config.json
drwxr-xr-x 6 root root 4.0K Feb  5  2021 aux_data
-rw-rw-r-- 1 root root  24K Jul 26  2021 CLIPS.ipynb
-rw-rw-r-- 1 root root   20 Jun  9  2021 README.md
-rw-rw-r-- 1 root root  627 Mar 11 17:55 requirements.txt
drwxr-xr-x 2 root root 4.0K Apr 12  2021 scripts
drwxrwxr-x 5 root root 4.0K Sep 23 00:25 src
-rw-rw-r-- 1 root root  31K Nov 10 00:06 stats.pdf
-rw-rw-r-- 1 root root  17K Feb  1 23:51 T5_test.ipynb
drwxrwxr-x 4 root root 4.0K Dec 13  2020 tools
-rw-rw-r-- 1 root root 3.3M Nov 10 00:06 Untitled.ipynb
-rw-rw-r-- 1 root root  69K Nov  9 23:02 vinvl_label.json
-rw-rw-r-- 1 root root 577K Nov 16 22:18 Visualization.ipynb
-rw-rw-r-- 1 root root 3.4K Sep  2  2021 visualize.py
03-16 04:42:17.895 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: pip freeze
03-16 04:42:17.895 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): pip
03-16 04:42:17.895 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): freeze
absl-py==1.0.0
adal==1.2.7
aiohttp==3.8.1
aiosignal==1.2.0
anytree==2.8.0
apex==0.1
asgiref==3.5.0
async-timeout==4.0.2
asynctest==0.13.0
attrs==21.4.0
azure==2.0.0
azure-batch==3.0.0
azure-common==1.1.28
azure-datalake-store==0.0.52
azure-graphrbac==0.30.0
azure-keyvault==0.3.7
azure-mgmt==1.0.0
azure-mgmt-authorization==0.30.0
azure-mgmt-batch==4.0.0
azure-mgmt-cdn==0.30.3
azure-mgmt-cognitiveservices==1.0.0
azure-mgmt-compute==1.0.0
azure-mgmt-containerregistry==0.2.1
azure-mgmt-datalake-analytics==0.1.6
azure-mgmt-datalake-nspkg==3.0.1
azure-mgmt-datalake-store==0.1.6
azure-mgmt-devtestlabs==2.0.0
azure-mgmt-dns==1.0.1
azure-mgmt-documentdb==0.1.3
azure-mgmt-iothub==0.2.2
azure-mgmt-keyvault==0.31.0
azure-mgmt-logic==2.1.0
azure-mgmt-monitor==0.2.1
azure-mgmt-network==1.0.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-rdbms==0.1.0
azure-mgmt-redis==4.1.1
azure-mgmt-resource==1.1.0
azure-mgmt-scheduler==1.1.3
azure-mgmt-sql==0.5.3
azure-mgmt-storage==1.0.0
azure-mgmt-trafficmanager==0.30.0
azure-mgmt-web==0.32.0
azure-nspkg==3.0.2
azure-servicebus==0.21.1
azure-servicefabric==5.6.130
azure-servicemanagement-legacy==0.20.7
azure-storage==0.34.3
azure-storage-blob==2.1.0
azure-storage-common==2.1.0
backcall==0.2.0
boto3==1.21.20
botocore==1.24.20
brotlipy==0.7.0
cachetools==5.0.0
certifi==2021.5.30
cffi @ file:///tmp/build/80754af9/cffi_1613246939562/work
chardet @ file:///tmp/build/80754af9/chardet_1605303159953/work
charset-normalizer==2.0.12
click==8.0.4
colorama==0.4.4
conda==4.10.1
conda-package-handling @ file:///tmp/build/80754af9/conda-package-handling_1618262151086/work
cryptography @ file:///tmp/build/80754af9/cryptography_1616769182610/work
cycler==0.10.0
Cython==0.29.23
dataclasses==0.6
decorator==5.1.1
Deprecated==1.2.13
dill==0.3.4
Django==3.2.12
docopt==0.6.2
easydict==1.9
einops==0.3.0
ete3==3.1.2
fairscale==0.4.2
filelock==3.6.0
frozenlist==1.3.0
fsspec==2022.2.0
future==0.18.2
gitdb==4.0.9
GitPython==3.1.27
google-auth==2.6.0
google-auth-oauthlib==0.4.6
grpcio==1.44.0
idna @ file:///tmp/build/80754af9/idna_1593446292537/work
imageio==2.9.0
importlib-metadata==4.11.3
ipdb==0.13.4
ipython==7.32.0
isodate==0.6.1
jedi==0.18.1
jeepney==0.7.1
Jinja2==3.0.3
jmespath==0.10.0
joblib==1.1.0
jsonpickle==1.5.2
keyring==23.5.0
kiwisolver==1.3.1
kmeans-pytorch==0.3
linecache2==1.0.0
llvmlite==0.38.0
Markdown==3.3.6
MarkupSafe==2.1.1
matplotlib==3.4.2
matplotlib-inline==0.1.3
mkl-fft==1.3.0
mkl-random @ file:///tmp/build/80754af9/mkl_random_1618853974840/work
mkl-service==2.3.0
msrest==0.6.21
msrestazure==0.4.34
multidict==6.0.2
multiprocess==0.70.12.2
munch==2.5.0
networkx==2.6.3
ninja==1.10.0.post2
nltk==3.7
numba==0.55.1
numpy==1.20.0
oauthlib==3.2.0
olefile==0.46
opencv-python==4.5.5.64
orderedset==2.0.3
packaging==21.3
pandas==1.1.5
parso==0.8.3
pathos==0.2.8
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.1.0
pox==0.3.0
ppft==1.6.6.4
progressbar==2.5
prompt-toolkit==3.0.28
protobuf==3.19.4
psutil==5.9.0
ptyprocess==0.7.0
py-cpuinfo==8.0.0
pyarrow==2.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycocotools==2.0
pycosat==0.6.3
pycparser @ file:///tmp/build/80754af9/pycparser_1594388511720/work
Pygments==2.11.2
PyJWT==2.3.0
pymongo==4.0.2
pyOpenSSL @ file:///tmp/build/80754af9/pyopenssl_1605545627475/work
pyparsing==3.0.0b2
PySocks @ file:///tmp/build/80754af9/pysocks_1594394576006/work
python-dateutil==2.8.1
python-magic==0.4.25
pytorch-lamb==1.0.0
pytorch-lightning==1.1.4
pytz==2021.3
PyWavelets==1.3.0
PyYAML==5.4.1
regex==2022.3.15
requests @ file:///tmp/build/80754af9/requests_1592841827918/work
requests-oauthlib==1.3.1
rsa==4.8
ruamel-yaml-conda @ file:///tmp/build/80754af9/ruamel_yaml_1616016701961/work
s3transfer==0.5.2
sacred==0.8.2
sacremoses==0.0.49
scikit-image==0.19.2
scikit-learn==1.0.2
scipy==1.7.3
SecretStorage==3.3.1
simplejson==3.17.6
six @ file:///tmp/build/80754af9/six_1605205313296/work
sklearn==0.0
smmap==5.0.0
sqlparse==0.4.2
tb-nightly==2.9.0a20220313
tensorboard==2.8.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
tensorboardX==2.5
threadpoolctl==3.1.0
tifffile==2021.11.2
timm @ git+https://github.com/rwightman/pytorch-image-models.git@7c67d6aca992f039eece0af5f7c29a43d48c00e4
tokenizers==0.9.4
torch==1.7.0
torchlars==0.1.2
torchvision==0.8.0
tqdm==4.56.0
traceback2==1.4.0
traitlets==5.1.1
transformers==4.2.1
typing-extensions @ file:///home/ktietz/src/ci_mi/typing_extensions_1612808209620/work
urllib3 @ file:///tmp/build/80754af9/urllib3_1603305693037/work
wcwidth==0.2.5
Werkzeug==2.0.3
wrapt==1.14.0
yacs==0.1.8
yarl==1.7.2
zipp==3.7.0
03-16 04:42:18.312 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:229 wrap_all(): python src/qd/pipeline.py -c ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
03-16 04:42:18.320 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:49 cmd_run(): start to cmd run: python src/qd/pipeline.py -c ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
03-16 04:42:18.320 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): python
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): src/qd/pipeline.py
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): -c
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 249 aml_server.py:51 cmd_run(): ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 04:42:18.321 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 04:42:19.000 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 0, 'mem_total': 32510, 'gpu_util': 0}]
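(The monitor() helper in aml_server.py is not shown in this log, but the dict shape it prints can be reproduced by querying nvidia-smi directly. A minimal, hypothetical sketch; the real implementation may differ.)

import subprocess

def gpu_stats():
    # Query per-GPU memory and utilization; `nounits` drops "MiB"/"%" so
    # each CSV line looks like "0, 32510, 0".
    out = subprocess.check_output([
        'nvidia-smi',
        '--query-gpu=memory.used,memory.total,utilization.gpu',
        '--format=csv,noheader,nounits',
    ]).decode()
    stats = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(', '))
        stats.append({'mem_used': used, 'mem_total': total, 'gpu_util': util})
    return stats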
2022-03-16 04:42:20,067.067 2829:qd_common.py:1742 setup_yaml(): python 3 env
2022-03-16 04:42:20,068.068 2829:qd_common.py:1105 parse_general_args(): loading parameter from ./aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml
2022-03-16 04:42:20,074.074 2829:pipeline.py:1365 <module>(): param: {'all_test_data': [{'test_data': 'TaxCocoCaption', 'test_split': 'test'}], 'param': {'add_od_labels': True, 'base_lr': 0.0001, 'basemodel': './output/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_bert_category/snapshot/model_iter_0081989.pt', 'category': 'bert', 'crop_pct': 1.0, 'data': 'TaxCocoCaption', 'drop_out': 0, 'effective_batch_size': 512, 'expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'expid_prefix': 'CAPU', 'force_predict': True, 'force_train': True, 'full_expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'ignore_project_image': True, 'image_encoder_pretrained': True, 'image_encoder_type': 'VitEmb_vit_base_patch16_384', 'img_feature_dim': 2054, 'input_small_scale': 0.08, 'log_step': 100, 'loss': 'focal', 'lr_multiplier': 0.1, 'mask_type': 'seq2seq', 'max_img_seq_length': 0, 'max_iter': '60e', 'max_seq_a_length': 20, 'max_seq_length': 70, 'monitor_after': True, 'multi_crop': False, 'multi_crop_scale': False, 'multi_scale': False, 'net': 'B', 'od_label_conf': 0.2, 'pad_to_max': True, 'pipeline_type': {'from': 'src.qd.pipelines.tagger_caption_uni_pipeline_expanding', 'import': 'CaptionUniPipeline'}, 'split_blocks': 4, 'tagemb': 'cls', 'test_batch_size': 16, 'test_crop_size': 384, 'text_encoder_type': './aux_data/untrained_config/VILT-L12-H784-uncased_16_384', 'tokenizer_file': 'vinvl_label.json', 'topk': 50, 'train_crop_size': 384, 'train_label_version': 'vinvl', 'train_transform': 'vit', 'use_amp': False, 'use_img_layernorm': False, 'weight_decay': 0.05}, 'type': 'pipeline_train_eval_multi'}
2022-03-16 04:42:20,235.235 2829:qd_common.py:3452 print_frame_info(): func name = pipeline_train_eval_multi; all_test_data = [{'test_data': 'TaxCocoCaption', 'test_split': 'test'}]; param = {'data': 'TaxCocoCaption', 'drop_out': 0, 'net': 'B', 'mask_type': 'seq2seq', 'tokenizer_file': 'vinvl_label.json', 'topk': 50, 'basemodel': './output/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_bert_category/snapshot/model_iter_0081989.pt', 'text_encoder_type': './aux_data/untrained_config/VILT-L12-H784-uncased_16_384', 'image_encoder_type': 'VitEmb_vit_base_patch16_384', 'crop_pct': 1.0, 'base_lr': 0.0001, 'split_blocks': 4, 'lr_multiplier': 0.1, 'monitor_after': True, 'test_crop_size': 384, 'train_crop_size': 384, 'multi_scale': False, 'multi_crop': False, 'multi_crop_scale': False, 'train_transform': 'vit', 'use_img_layernorm': False, 'expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'full_expid': 'Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb', 'image_encoder_pretrained': True, 'expid_prefix': 'CAPU', 'pad_to_max': True, 'add_od_labels': True, 'effective_batch_size': 512, 'test_batch_size': 16, 'max_iter': '60e', 'ignore_project_image': True, 'input_small_scale': 0.08, 'log_step': 100, 'weight_decay': 0.05, 'use_amp': False, 'tagemb': 'cls', 'max_img_seq_length': 0, 'od_label_conf': 0.2, 'max_seq_length': 70, 'max_seq_a_length': 20, 'img_feature_dim': 2054, 'train_label_version': 'vinvl', 'loss': 'focal', 'category': 'bert', 'force_train': True, 'force_predict': True, 'pipeline_type': {'from': 'src.qd.pipelines.tagger_caption_uni_pipeline_expanding', 'import': 'CaptionUniPipeline'}}
2022-03-16 04:42:20,235.235 2829:qd_common.py:1742 setup_yaml(): python 3 env
2022-03-16 04:42:26,690.690 2829:torch_common.py:408 ensure_init_process_group(): {'backend': 'nccl', 'init_method': 'tcp://10.0.0.8:12345', 'rank': 0, 'world_size': 32, 'timeout': datetime.timedelta(days=10)}
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO Bootstrap : Using [0]eth0:10.0.0.8<0>
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO NCCL_IB_DISABLE set by environment to 0.
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so': libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libmthca-rdmav25.so': libmthca-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libhfi1verbs-rdmav25.so': libhfi1verbs-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libi40iw-rdmav25.so': libi40iw-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libqedr-rdmav25.so': libqedr-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libcxgb4-rdmav25.so': libcxgb4-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libocrdma-rdmav25.so': libocrdma-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libipathverbs-rdmav25.so': libipathverbs-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libhns-rdmav25.so': libhns-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libbnxt_re-rdmav25.so': libbnxt_re-rdmav25.so: cannot open shared object file: No such file or directory
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO NET/IB : Using [0]mlx5_ib0:1/IB ; OOB eth0:10.0.0.8<0>
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO Using network IB
NCCL version 2.7.8+cuda10.2
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO NCCL_IB_TIMEOUT set by environment to 32.
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 00/02 : 0 1 2 4 7 6 5 3 8 9 10 12 15 14 13 11 16 17 18 20
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01/02 : 0 1 2 4 7 6 5 3 8 9 10 12 15 14 13 11 16 17 18 20
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO threadThresholds 8/8/64 | 256/8/64 | 8/8/64
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1|-1->0->1/-1/-1 [1] 1/-1/-1->0->25|25->0->1/-1/-1
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Setting affinity for GPU 0 to 0fffff
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 00 : 27[400000] -> 0[100000] [receive] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 00 : 0[100000] -> 1[200000] via P2P/IPC
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 27[400000] -> 0[100000] [receive] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 0[100000] -> 1[200000] via P2P/IPC
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 0[100000] -> 25[200000] [send] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO Channel 01 : 25[200000] -> 0[100000] [receive] via NET/IB/0
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO 2 coll channels, 2 p2p channels, 1 p2p channels per peer
42c07f8197104c3b988e50758ff54da200000C:2829:3282 [0] NCCL INFO comm 0x7fcb40001060 rank 0 nranks 32 cudaDev 0 busId 100000 - Init COMPLETE
42c07f8197104c3b988e50758ff54da200000C:2829:2829 [0] NCCL INFO Launch mode Parallel
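(For reference, the ensure_init_process_group() call logged above boils down to a single torch.distributed call; this is a minimal sketch using only values taken from the log, since the helper itself in src/qd/torch_common.py is not shown here.)

import datetime
import torch.distributed as dist

# Values from the ensure_init_process_group() log line: rank-0's eth0
# address hosts the TCP rendezvous; world_size 32 spans 8 V100s per node.
# In the real job each process passes its own rank instead of 0.
dist.init_process_group(
    backend='nccl',
    init_method='tcp://10.0.0.8:12345',
    rank=0,
    world_size=32,
    timeout=datetime.timedelta(days=10),
)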
'input_small_scale': 0.08, 'label_smoothing': 0.1, 'ln_no_weight_decay': True, 'log_step': 100, 'loss': 'focal', 'lr_multiplier': 0.1, 'mask_prob': 0.15, 'mask_type': 'seq2seq', 'max_gen_length': 20, 'max_img_seq_length': 0, 'max_iter': '60e', 'max_masked_tokens': 3, 'max_seq_a_length': 20, 'max_seq_length': 70, 'min_rel_lr_in_cosine': 0.0, 'mobilenetv3_dropout_ratio': 0.2, 'momentum': 0.9, 'monitor_after': True, 'multi_crop': False, 'multi_crop_scale': False, 'multi_scale': False, 'net': 'B', 'no_sort_by_conf': False, 'num_beams': 1, 'num_workers': 8, 'od_label_conf': 0.2, 'od_label_conf ': 0.2, 'optimizer_type': 'MAdamW', 'output_isvalid': False, 'ovthresh': [-1], 'pad_to_max': True, 'pert_img_prob': None, 'pipeline_type': {'from': 'src.qd.pipelines.tagger_caption_uni_pipeline_expanding', 'import': 'CaptionUniPipeline'}, 'pred_tsv_to_json_extra': 1, 'random_seed': 88, 'real_text_a_in_test': False, 'replace_by_mask_prob': 0.8, 'replace_by_rand_prob': 0.1, 'rms_alpha': 0.99, 'scheduler_type': 'linear', 'smooth_label_eps': 0.1, 'snapshot_steps': 5000, 'split_blocks': 4, 'splitbysplitsample_buffer_size': 1, 'splitbysplitsample_group_size': 1, 'step_lr': 30, 'tagemb': 'cls', 'temperature': 1, 'test_batch_size': 16, 'test_crop_size': 384, 'test_data': 'TaxCocoCaption', 'test_mergebn': False, 'test_split': 'test', 'text_encoder_type': './aux_data/untrained_config/VILT-L12-H784-uncased_16_384', 'tie_weights': True, 'tokenizer_file': 'vinvl_label.json', 'top_k': 0, 'top_p': 1, 'topk': 50, 'train_crop_size': 384, 'train_label_version': 'vinvl', 'train_shuffle': True, 'train_transform': 'vit', 'unique_labels_on': False, 'use_amp': False, 'use_img_layernorm': False, 'warmup_steps': 0, 'weight_decay': 0.05}
2022-03-16 04:43:35,522.522 2829:uni_pipeline.py:545 ensure_train(): torch info = {'cuda': '10.2', 'cudnn': 7605, 'current_device': 0, 'device_count': 8, 'nccl': 2708, 'version': '1.7.0'}
2022-03-16 04:43:35,621.621 2829:modeling_utils.py:187 from_pretrained(): loading configuration file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/config.json
2022-03-16 04:43:35,621.621 2829:modeling_utils.py:211 from_pretrained(): Model config {
  "attention_probs_dropout_prob": 0.1,
  "finetuning_task": "image_captioning",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "TIMM_vit",
  "net": "vit_base_patch16_384",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pretrained": true,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:170 _from_pretrained(): Model name './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc). Assuming './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' is a path or url to a directory containing tokenizer files.
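Because the path is not a known shortcut name, the loader treats it as a local tokenizer directory; only vocab.txt is present there, and the optional added_tokens.json and special_tokens_map.json are skipped, as the next entries show. A minimal sketch of the equivalent load with the public Hugging Face transformers API (illustrative; this codebase uses its own vendored copy under src/qd/mask/layers/bert):

```python
from transformers import BertTokenizer

# Only vocab.txt is required in the directory; added_tokens.json and
# special_tokens_map.json are optional and are skipped when absent.
tokenizer = BertTokenizer.from_pretrained(
    "./aux_data/untrained_config/VILT-L12-H784-uncased_16_384"
)
print(tokenizer.tokenize("a dog sitting on a bench"))
```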
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/added_tokens.json. We won't load it.
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/special_tokens_map.json. We won't load it.
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-16 04:43:35,624.624 2829:tokenization_utils.py:214 _from_pretrained(): loading file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/vocab.txt
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 620, in pipeline_train_eval_multi
    pip.ensure_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 550, in ensure_train
    train_result = self.train()
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 605, in train
    model = self.get_model(is_train=True)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
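This traceback is noise rather than a training failure: logging.info(config) triggers the config object's __repr__, which json-dumps the config's attribute dict, and that dict holds a live BertTokenizer, which the JSON encoder cannot serialize. A minimal sketch of a tolerant serializer (a hypothetical patch to the vendored modeling_utils.py, assuming its to_dict() returns the attribute dict as the traceback implies):

```python
import json

def to_json_string(self):
    """Render the config as JSON; fall back to repr() for values the JSON
    encoder cannot handle (e.g. the BertTokenizer stored on the config)."""
    return json.dumps(self.to_dict(), indent=2, sort_keys=True, default=repr) + "\n"
```

With default=repr the log call succeeds and the tokenizer simply appears as its repr string instead of aborting the whole message.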
2022-03-16 04:43:36,373.373 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-16 04:43:37,526.526 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-16 04:43:42,387.387 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-16 04:43:43,023.023 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-16 04:43:43,176.176 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-16 04:43:43,972.972 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight!
Note that this might be replaced by pre-trained checkpoint later! 2022-03-16 04:43:43,972.972 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 2022-03-16 04:43:45,257.257 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-16 04:43:46,254.254 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.cls_token: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.pos_embed: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.patch_embed.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.patch_embed.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.head.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): module.head.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,255.255 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): word_embeddings.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): position_embeddings.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): token_type_embeddings.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,256.256 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 
2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,257.257 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,258.258 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
2.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,259.259 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,260.260 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,261.261 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 
04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 4.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,262.262 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 5.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,263.263 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 
get_parameter_groups(): 6.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 6.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,264.264 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,265.265 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 7.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.weight: lr = 0.0001; weight_decay = 0.05 
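The grouping rule running through this dump is consistent: every weight gets weight_decay = 0.05 while biases and LayerNorm parameters get 0.0, matching the bias_no_weight_decay and ln_no_weight_decay flags in the config above. A minimal sketch of how such groups are typically built (illustrative only, not the repo's actual get_parameter_groups):

```python
import torch

def build_param_groups(model: torch.nn.Module, lr=1e-4, weight_decay=0.05):
    """Split parameters into decay / no-decay groups, mirroring the dump above:
    biases and LayerNorm parameters train without weight decay."""
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        if name.endswith(".bias") or "LayerNorm" in name:
            no_decay.append(param)
        else:
            decay.append(param)
    return [
        {"params": decay, "lr": lr, "weight_decay": weight_decay},
        {"params": no_decay, "lr": lr, "weight_decay": 0.0},
    ]

# e.g. optimizer = torch.optim.AdamW(build_param_groups(model), betas=(0.9, 0.999), eps=1e-8)
```

The optimizer dump further below additionally scales several image-encoder groups down to lr = 1e-05, i.e. base_lr times the configured lr_multiplier (0.0001 x 0.1).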
2022-03-16 04:43:46,266.266 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,267.267 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,268.268 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 
2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,269.269 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,270.270 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
0.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,271.271 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 0.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,272.272 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 1.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 
04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,273.273 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 2.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.qkv.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,274.274 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.attn.proj.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.norm2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc1.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 3.mlp.fc2.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,275.275 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
predictions.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.transform.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): predictions.decoder.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,276.276 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,277.277 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.0.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
layer.0.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,278.278 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.1.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,279.279 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): 
layer.2.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,280.280 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.2.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.query.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.query.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.key.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.key.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,281.281 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.value.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.self.value.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.attention.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 
get_parameter_groups(): layer.3.attention.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.intermediate.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.intermediate.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.dense.weight: lr = 0.0001; weight_decay = 0.05 2022-03-16 04:43:46,282.282 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.dense.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,283.283 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.LayerNorm.weight: lr = 0.0001; weight_decay = 0 2022-03-16 04:43:46,283.283 2829:tagger_caption_uni_pipeline_expanding.py:655 get_parameter_groups(): layer.3.output.LayerNorm.bias: lr = 0.0001; weight_decay = 0.0 2022-03-16 04:43:46,283.283 2829:tagger_caption_uni_pipeline_expanding.py:692 get_optimizer(): LR Updating... learning rate 0.0001, 2022-03-16 04:43:46,292.292 2829:tagger_caption_uni_pipeline_expanding.py:608 train(): AdamW ( Parameter Group 0 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['module.cls_token', 'module.pos_embed', 'module.patch_embed.proj.weight', 'module.head.weight'] weight_decay: 0.05 Parameter Group 1 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['module.patch_embed.proj.bias', 'module.head.bias'] weight_decay: 0.0 Parameter Group 2 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['word_embeddings.weight', 'position_embeddings.weight', 'token_type_embeddings.weight'] weight_decay: 0.05 Parameter Group 3 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['LayerNorm.weight', 'LayerNorm.bias'] weight_decay: 0 Parameter Group 4 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight', '4.norm1.weight', '4.attn.qkv.weight', '4.attn.proj.weight', '4.norm2.weight', '4.mlp.fc1.weight', '4.mlp.fc2.weight', '5.norm1.weight', '5.attn.qkv.weight', '5.attn.proj.weight', '5.norm2.weight', '5.mlp.fc1.weight', '5.mlp.fc2.weight', '6.norm1.weight', '6.attn.qkv.weight', '6.attn.proj.weight', '6.norm2.weight', '6.mlp.fc1.weight', '6.mlp.fc2.weight', '7.norm1.weight', '7.attn.qkv.weight', '7.attn.proj.weight', '7.norm2.weight', '7.mlp.fc1.weight', '7.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 5 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', 
'3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias', '4.norm1.bias', '4.attn.qkv.bias', '4.attn.proj.bias', '4.norm2.bias', '4.mlp.fc1.bias', '4.mlp.fc2.bias', '5.norm1.bias', '5.attn.qkv.bias', '5.attn.proj.bias', '5.norm2.bias', '5.mlp.fc1.bias', '5.mlp.fc2.bias', '6.norm1.bias', '6.attn.qkv.bias', '6.attn.proj.bias', '6.norm2.bias', '6.mlp.fc1.bias', '6.mlp.fc2.bias', '7.norm1.bias', '7.attn.qkv.bias', '7.attn.proj.bias', '7.norm2.bias', '7.mlp.fc1.bias', '7.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 6 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 7 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 8 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 9 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 10 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 11 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 12 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 13 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 14 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['predictions.bias', 'predictions.transform.dense.bias', 'predictions.transform.LayerNorm.weight', 'predictions.transform.LayerNorm.bias'] weight_decay: 0.0 Parameter Group 15 betas: 
(0.9, 0.999) correct_bias: True eps: 1e-08 lr: 1e-05 param_names: ['predictions.transform.dense.weight', 'predictions.decoder.weight'] weight_decay: 0.05 Parameter Group 16 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['layer.0.attention.self.query.weight', 'layer.0.attention.self.key.weight', 'layer.0.attention.self.value.weight', 'layer.0.attention.output.dense.weight', 'layer.0.intermediate.dense.weight', 'layer.0.output.dense.weight', 'layer.1.attention.self.query.weight', 'layer.1.attention.self.key.weight', 'layer.1.attention.self.value.weight', 'layer.1.attention.output.dense.weight', 'layer.1.intermediate.dense.weight', 'layer.1.output.dense.weight', 'layer.2.attention.self.query.weight', 'layer.2.attention.self.key.weight', 'layer.2.attention.self.value.weight', 'layer.2.attention.output.dense.weight', 'layer.2.intermediate.dense.weight', 'layer.2.output.dense.weight', 'layer.3.attention.self.query.weight', 'layer.3.attention.self.key.weight', 'layer.3.attention.self.value.weight', 'layer.3.attention.output.dense.weight', 'layer.3.intermediate.dense.weight', 'layer.3.output.dense.weight'] weight_decay: 0.05 Parameter Group 17 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 lr: 0.0001 param_names: ['layer.0.attention.self.query.bias', 'layer.0.attention.self.key.bias', 'layer.0.attention.self.value.bias', 'layer.0.attention.output.dense.bias', 'layer.0.attention.output.LayerNorm.weight', 'layer.0.attention.output.LayerNorm.bias', 'layer.0.intermediate.dense.bias', 'layer.0.output.dense.bias', 'layer.0.output.LayerNorm.weight', 'layer.0.output.LayerNorm.bias', 'layer.1.attention.self.query.bias', 'layer.1.attention.self.key.bias', 'layer.1.attention.self.value.bias', 'layer.1.attention.output.dense.bias', 'layer.1.attention.output.LayerNorm.weight', 'layer.1.attention.output.LayerNorm.bias', 'layer.1.intermediate.dense.bias', 'layer.1.output.dense.bias', 'layer.1.output.LayerNorm.weight', 'layer.1.output.LayerNorm.bias', 'layer.2.attention.self.query.bias', 'layer.2.attention.self.key.bias', 'layer.2.attention.self.value.bias', 'layer.2.attention.output.dense.bias', 'layer.2.attention.output.LayerNorm.weight', 'layer.2.attention.output.LayerNorm.bias', 'layer.2.intermediate.dense.bias', 'layer.2.output.dense.bias', 'layer.2.output.LayerNorm.weight', 'layer.2.output.LayerNorm.bias', 'layer.3.attention.self.query.bias', 'layer.3.attention.self.key.bias', 'layer.3.attention.self.value.bias', 'layer.3.attention.output.dense.bias', 'layer.3.attention.output.LayerNorm.weight', 'layer.3.attention.output.LayerNorm.bias', 'layer.3.intermediate.dense.bias', 'layer.3.output.dense.bias', 'layer.3.output.LayerNorm.weight', 'layer.3.output.LayerNorm.bias'] weight_decay: 0.0 ) 2022-03-16 04:43:46,459.459 2829:checkpoint.py:240 load(): Loading checkpoint from ./output/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_bert_category/snapshot/model_iter_0081989.pt 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-16 
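The AdamW dump above splits the parameters along two axes: weight tensors get weight_decay = 0.05 while biases and LayerNorm parameters get 0.0, and the groups listed with lr: 1e-05 (already-pretrained blocks and heads) run at a reduced rate while the rest use the base lr = 1e-4. Below is a minimal sketch of how such groups can be built; the helper name and the `pretrained_prefixes` argument are illustrative, not the pipeline's actual get_parameter_groups().

```python
import torch

def build_param_groups(model, base_lr=1e-4, pretrained_lr=1e-5,
                       weight_decay=0.05, pretrained_prefixes=()):
    """Split parameters along two axes, mirroring the dump above:
    decay vs. no-decay (biases and 1-D LayerNorm params get 0.0),
    and base lr vs. a reduced lr for already-pretrained submodules."""
    groups = {}
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        # 1-D tensors (biases, LayerNorm weights) are excluded from decay.
        decay = 0.0 if p.ndim <= 1 or name.endswith(".bias") else weight_decay
        lr = pretrained_lr if name.startswith(tuple(pretrained_prefixes)) else base_lr
        groups.setdefault((lr, decay), []).append(p)
    return [{"params": ps, "lr": lr, "weight_decay": wd}
            for (lr, wd), ps in groups.items()]

# optimizer = torch.optim.AdamW(build_param_groups(model),
#                               betas=(0.9, 0.999), eps=1e-8)
```

Grouping by the (lr, weight_decay) pair rather than per parameter keeps the optimizer state compact while reproducing the per-name assignments logged by get_parameter_groups().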
04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-16 04:43:58,590.590 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of 
shape (2304,) 2022-03-16 04:43:58,591.591 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,592.592 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm1.bias loaded from 
bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,593.593 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): 
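A reading aid for the shapes in this listing: PyTorch `nn.Linear` stores its weight as (out_features, in_features), so the fused qkv projection of a 768-dim block is (2304, 768) with query, key and value stacked along the output dimension, the MLP pair is (3072, 768) and (768, 3072), and every bias or LayerNorm parameter is 1-D. A quick check:

```python
import torch.nn as nn

# nn.Linear weight is (out_features, in_features); qkv fuses 3 x 768 outputs.
qkv = nn.Linear(768, 3 * 768, bias=True)
print(tuple(qkv.weight.shape), tuple(qkv.bias.shape))  # (2304, 768) (2304,)
```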
module.module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,594.594 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,595.595 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,596.596 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.mlp.fc2.weight loaded from bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,597.597 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,598.598 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,599.599 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,599.599 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,600.600 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,600.600 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-16 04:43:58,601.601 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-16 
04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-16 04:43:58,602.602 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:99 align_and_update_state_dicts(): module.module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-16 04:43:58,603.603 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 158; loaded = 158 2022-03-16 04:43:58,604.604 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = [] 2022-03-16 04:43:58,608.608 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = []; total = 0 2022-03-16 04:43:58,744.744 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.module.bert.embeddings.word_embeddings.weight', 'module.module.bert.embeddings.position_embeddings.weight', 'module.module.bert.embeddings.token_type_embeddings.weight', 'module.module.bert.embeddings.LayerNorm.weight', 'module.module.bert.embeddings.LayerNorm.bias', 'module.module.bert.extra_embeddings.word_embeddings.weight', 'module.module.bert.extra_embeddings.position_embeddings.weight', 'module.module.bert.extra_embeddings.token_type_embeddings.weight', 'module.module.bert.extra_embeddings.LayerNorm.weight', 'module.module.bert.extra_embeddings.LayerNorm.bias', 
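Each align_and_update_state_dicts() line above pairs a target name such as module.module.bert.encoder.blocks.0.attn.proj.bias with the checkpoint name it ends with (bert.encoder.blocks.0.attn.proj.bias), so the match tolerates the wrapper prefixes added by DistributedDataParallel and the captioning wrapper; the summary just logged reads "target model param = 288; name matched = 158; loaded = 158". Below is a sketch of such suffix matching, under the assumption that the real checkpoint.py works this way (its tie-breaking may differ):

```python
def align_by_suffix(model_state, ckpt_state):
    """Pair each model key with a checkpoint key it ends with at a
    '.' boundary (tolerating wrapper prefixes such as the doubled
    'module.'), preferring the longest match and requiring equal
    tensor shapes. Unmatched keys keep their fresh initialization."""
    aligned = {}
    for tgt, tgt_val in model_state.items():
        candidates = [
            src for src, src_val in ckpt_state.items()
            if (tgt == src or tgt.endswith("." + src))
            and src_val.shape == tgt_val.shape
        ]
        if candidates:
            aligned[tgt] = ckpt_state[max(candidates, key=len)]
    return aligned
```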
'module.module.bert.encoder.tag_blocks.0.norm1.weight', 'module.module.bert.encoder.tag_blocks.0.norm1.bias', 'module.module.bert.encoder.tag_blocks.0.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.0.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.0.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.0.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.0.norm2.weight', 'module.module.bert.encoder.tag_blocks.0.norm2.bias', 'module.module.bert.encoder.tag_blocks.0.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.0.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.0.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.0.mlp.fc2.bias', 'module.module.bert.encoder.tag_blocks.1.norm1.weight', 'module.module.bert.encoder.tag_blocks.1.norm1.bias', 'module.module.bert.encoder.tag_blocks.1.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.1.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.1.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.1.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.1.norm2.weight', 'module.module.bert.encoder.tag_blocks.1.norm2.bias', 'module.module.bert.encoder.tag_blocks.1.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.1.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.1.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.1.mlp.fc2.bias', 'module.module.bert.encoder.tag_blocks.2.norm1.weight', 'module.module.bert.encoder.tag_blocks.2.norm1.bias', 'module.module.bert.encoder.tag_blocks.2.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.2.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.2.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.2.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.2.norm2.weight', 'module.module.bert.encoder.tag_blocks.2.norm2.bias', 'module.module.bert.encoder.tag_blocks.2.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.2.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.2.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.2.mlp.fc2.bias', 'module.module.bert.encoder.tag_blocks.3.norm1.weight', 'module.module.bert.encoder.tag_blocks.3.norm1.bias', 'module.module.bert.encoder.tag_blocks.3.attn.qkv.weight', 'module.module.bert.encoder.tag_blocks.3.attn.qkv.bias', 'module.module.bert.encoder.tag_blocks.3.attn.proj.weight', 'module.module.bert.encoder.tag_blocks.3.attn.proj.bias', 'module.module.bert.encoder.tag_blocks.3.norm2.weight', 'module.module.bert.encoder.tag_blocks.3.norm2.bias', 'module.module.bert.encoder.tag_blocks.3.mlp.fc1.weight', 'module.module.bert.encoder.tag_blocks.3.mlp.fc1.bias', 'module.module.bert.encoder.tag_blocks.3.mlp.fc2.weight', 'module.module.bert.encoder.tag_blocks.3.mlp.fc2.bias', 'module.module.bert.caption_pooler.dense.weight', 'module.module.bert.caption_pooler.dense.bias', 'module.module.bert.decoder.layer.0.attention.self.query.weight', 'module.module.bert.decoder.layer.0.attention.self.query.bias', 'module.module.bert.decoder.layer.0.attention.self.key.weight', 'module.module.bert.decoder.layer.0.attention.self.key.bias', 'module.module.bert.decoder.layer.0.attention.self.value.weight', 'module.module.bert.decoder.layer.0.attention.self.value.bias', 'module.module.bert.decoder.layer.0.attention.output.dense.weight', 'module.module.bert.decoder.layer.0.attention.output.dense.bias', 'module.module.bert.decoder.layer.0.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.0.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.0.intermediate.dense.weight', 
'module.module.bert.decoder.layer.0.intermediate.dense.bias', 'module.module.bert.decoder.layer.0.output.dense.weight', 'module.module.bert.decoder.layer.0.output.dense.bias', 'module.module.bert.decoder.layer.0.output.LayerNorm.weight', 'module.module.bert.decoder.layer.0.output.LayerNorm.bias', 'module.module.bert.decoder.layer.1.attention.self.query.weight', 'module.module.bert.decoder.layer.1.attention.self.query.bias', 'module.module.bert.decoder.layer.1.attention.self.key.weight', 'module.module.bert.decoder.layer.1.attention.self.key.bias', 'module.module.bert.decoder.layer.1.attention.self.value.weight', 'module.module.bert.decoder.layer.1.attention.self.value.bias', 'module.module.bert.decoder.layer.1.attention.output.dense.weight', 'module.module.bert.decoder.layer.1.attention.output.dense.bias', 'module.module.bert.decoder.layer.1.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.1.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.1.intermediate.dense.weight', 'module.module.bert.decoder.layer.1.intermediate.dense.bias', 'module.module.bert.decoder.layer.1.output.dense.weight', 'module.module.bert.decoder.layer.1.output.dense.bias', 'module.module.bert.decoder.layer.1.output.LayerNorm.weight', 'module.module.bert.decoder.layer.1.output.LayerNorm.bias', 'module.module.bert.decoder.layer.2.attention.self.query.weight', 'module.module.bert.decoder.layer.2.attention.self.query.bias', 'module.module.bert.decoder.layer.2.attention.self.key.weight', 'module.module.bert.decoder.layer.2.attention.self.key.bias', 'module.module.bert.decoder.layer.2.attention.self.value.weight', 'module.module.bert.decoder.layer.2.attention.self.value.bias', 'module.module.bert.decoder.layer.2.attention.output.dense.weight', 'module.module.bert.decoder.layer.2.attention.output.dense.bias', 'module.module.bert.decoder.layer.2.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.2.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.2.intermediate.dense.weight', 'module.module.bert.decoder.layer.2.intermediate.dense.bias', 'module.module.bert.decoder.layer.2.output.dense.weight', 'module.module.bert.decoder.layer.2.output.dense.bias', 'module.module.bert.decoder.layer.2.output.LayerNorm.weight', 'module.module.bert.decoder.layer.2.output.LayerNorm.bias', 'module.module.bert.decoder.layer.3.attention.self.query.weight', 'module.module.bert.decoder.layer.3.attention.self.query.bias', 'module.module.bert.decoder.layer.3.attention.self.key.weight', 'module.module.bert.decoder.layer.3.attention.self.key.bias', 'module.module.bert.decoder.layer.3.attention.self.value.weight', 'module.module.bert.decoder.layer.3.attention.self.value.bias', 'module.module.bert.decoder.layer.3.attention.output.dense.weight', 'module.module.bert.decoder.layer.3.attention.output.dense.bias', 'module.module.bert.decoder.layer.3.attention.output.LayerNorm.weight', 'module.module.bert.decoder.layer.3.attention.output.LayerNorm.bias', 'module.module.bert.decoder.layer.3.intermediate.dense.weight', 'module.module.bert.decoder.layer.3.intermediate.dense.bias', 'module.module.bert.decoder.layer.3.output.dense.weight', 'module.module.bert.decoder.layer.3.output.dense.bias', 'module.module.bert.decoder.layer.3.output.LayerNorm.weight', 'module.module.bert.decoder.layer.3.output.LayerNorm.bias', 'module.module.cls.predictions.bias', 'module.module.cls.predictions.transform.dense.weight', 'module.module.cls.predictions.transform.dense.bias', 
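The "unique key (not initialized)" list that runs through here (its tail follows just below) covers the parts of the model with no counterpart in the checkpoint: the text embeddings and extra_embeddings, the four tag_blocks, the caption decoder layers, and the cls prediction head. Those weights train from scratch. In plain PyTorch the same partial load is a non-strict load_state_dict; reusing `aligned` from the sketch above:

```python
# Plain-PyTorch equivalent of load_model_state_ignore_mismatch (a sketch,
# not the pipeline's own implementation):
result = model.load_state_dict(aligned, strict=False)
print("unique key (not initialized) in current model =", result.missing_keys)
print("from loaded; ignore =", result.unexpected_keys)
```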
'module.module.cls.predictions.transform.LayerNorm.weight', 'module.module.cls.predictions.transform.LayerNorm.bias', 'module.module.cls.predictions.decoder.weight'] 2022-03-16 04:43:58,755.755 2829:tagger_caption_uni_pipeline_expanding.py:625 train(): 2022-03-16 04:44:27,988.988 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = train; add_key = False; backend = cv; hold_buffer = 0; save_original = False 2022-03-16 04:44:28,245.245 2829:samplers.py:158 __init__(): before making divisible = 17711 2022-03-16 04:44:28,245.245 2829:samplers.py:161 __init__(): adjust to = 17712 2022-03-16 04:44:28,245.245 2829:uni_pipeline.py:509 get_data_loader(): sampler = 2022-03-16 04:44:28,246.246 2829:uni_pipeline.py:742 do_train(): DistributedDataParallel( (module): ImageCaptioning( (module): TaggerEncDecSplitForImageCaptioning( (bert): TaggerEncDecCLSEmbSplitBertImgModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (extra_embeddings): BertEmbeddings( (word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (encoder): TIMMVitSplitEncoder( (blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) 
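Two details in the setup just logged: the train split has 17711 samples, and the sampler pads that to 17712 so batches divide evenly across ranks. 17712 is the next multiple of 16 above 17711; the actual divisor (world size, or world size times some chunk size) is not stated in the log, so 16 below is a guess. The adjustment itself is a one-liner:

```python
import math

def make_divisible(n, divisor):
    """Round n up to the next multiple of divisor; the padded tail is
    typically filled by wrapping around to the first samples."""
    return int(math.ceil(n / divisor)) * divisor

print(make_divisible(17711, 16))  # 17712 -- divisor 16 is a guess from the log
```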
(proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (6): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (7): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (8): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (9): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() 
(norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (10): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (11): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (tag_blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, 
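Every Block in this dump, the twelve encoder `blocks` and the four `tag_blocks` alike, repeats the same pre-norm layout: LayerNorm, fused-qkv attention, residual add, then LayerNorm, MLP (fc1 → GELU → fc2), residual add, with all dropouts at p=0.0 and drop_path reduced to Identity. A condensed, timm-style paraphrase (num_heads=12 is assumed from the 768-dim width; it is not printed in the dump):

```python
import torch.nn as nn

class Block(nn.Module):
    """Pre-norm transformer block matching the dump: 768-dim, 3072 MLP."""
    def __init__(self, dim=768, mlp_ratio=4.0, num_heads=12):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim, eps=1e-6)
        # nn.MultiheadAttention keeps the same fused (3*dim, dim) qkv weight.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim, eps=1e-6)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # residual 1
        return x + self.mlp(self.norm2(x))                 # residual 2
```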
elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) (caption_pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (tag_logit): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (decoder): BertEncoder( (layer): ModuleList( (0): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (1): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (2): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (3): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, 
bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) ) ) (img_embedding): Identity() (dropout): Identity() ) (cls): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (loss): BertCaptioningLoss( (log_soft): LogSoftmax(dim=1) (kl): KLDivLoss() ) (tag_loss): FocalLossWithLogitsNegLoss(alpha=0.5, gamma=1) ) (image_encoder): InputAsDict( (module): VisionTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16)) ) (pos_drop): Dropout(p=0.0, inplace=False) (blocks): ModuleList() (norm): Identity() (pre_logits): Identity() (head): Linear(in_features=768, out_features=1000, bias=True) ) ) ) ) 2022-03-16 04:44:28,251.251 2829:uni_pipeline.py:744 do_train(): : training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.word_embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.position_embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.token_type_embeddings: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.LayerNorm: training=True 2022-03-16 04:44:28,252.252 2829:uni_pipeline.py:744 do_train(): module.module.bert.embeddings.dropout: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.word_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.position_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.token_type_embeddings: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.LayerNorm: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.extra_embeddings.dropout: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0: training=True 2022-03-16 04:44:28,253.253 2829:uni_pipeline.py:744 do_train(): 
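The loss modules at the end of the dump show the split objective: BertCaptioningLoss (a LogSoftmax + KLDivLoss pair, i.e. label-smoothed cross-entropy) for caption tokens, and FocalLossWithLogitsNegLoss(alpha=0.5, gamma=1) for the multi-label tag head. The exact form of the latter is not in the log; a standard binary focal loss with logits, which the name suggests, looks like:

```python
import torch
import torch.nn.functional as F

def focal_loss_with_logits(logits, targets, alpha=0.5, gamma=1.0):
    """Generic binary focal loss (an assumption about what
    FocalLossWithLogitsNegLoss computes; the repo's version may differ,
    e.g. in how it weights the negative term). targets: multi-hot floats."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)            # prob of true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()
```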
module.module.bert.encoder.blocks.0.norm1: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.qkv: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.attn_drop: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.proj: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.attn.proj_drop: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.drop_path: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.norm2: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.fc1: training=True 2022-03-16 04:44:28,254.254 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.act: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.fc2: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.0.mlp.drop: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.norm1: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.qkv: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.attn_drop: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.proj: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.attn.proj_drop: training=True 2022-03-16 04:44:28,255.255 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.drop_path: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.norm2: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.fc1: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.act: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.fc2: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.1.mlp.drop: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.norm1: training=True 2022-03-16 
04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.qkv: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.attn_drop: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.proj: training=True 2022-03-16 04:44:28,256.256 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.attn.proj_drop: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.drop_path: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.norm2: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.fc1: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.act: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.fc2: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.2.mlp.drop: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.norm1: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.qkv: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.attn_drop: training=True 2022-03-16 04:44:28,257.257 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.proj: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.attn.proj_drop: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.drop_path: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.norm2: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.fc1: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.act: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.fc2: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.3.mlp.drop: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.norm1: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.encoder.blocks.4.attn: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.qkv: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.attn_drop: training=True 2022-03-16 04:44:28,258.258 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.proj: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.attn.proj_drop: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.drop_path: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.norm2: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.fc1: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.act: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.fc2: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.4.mlp.drop: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.norm1: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn: training=True 2022-03-16 04:44:28,259.259 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.qkv: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.attn_drop: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.proj: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.attn.proj_drop: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.drop_path: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.norm2: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.fc1: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.act: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.fc2: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.5.mlp.drop: training=True 2022-03-16 04:44:28,260.260 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.norm1: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn: training=True 2022-03-16 
04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.qkv: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.attn_drop: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.proj: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.attn.proj_drop: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.drop_path: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.norm2: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.fc1: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.act: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.fc2: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.6.mlp.drop: training=True 2022-03-16 04:44:28,261.261 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.norm1: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.qkv: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.attn_drop: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.proj: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.attn.proj_drop: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.drop_path: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.norm2: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.fc1: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.act: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.fc2: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.7.mlp.drop: training=True 2022-03-16 04:44:28,262.262 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.norm1: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.encoder.blocks.8.attn.qkv: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn.attn_drop: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn.proj: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.attn.proj_drop: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.drop_path: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.norm2: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.fc1: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.act: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.fc2: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.8.mlp.drop: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.norm1: training=True 2022-03-16 04:44:28,263.263 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.qkv: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.attn_drop: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.proj: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.attn.proj_drop: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.drop_path: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.norm2: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.fc1: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.act: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.fc2: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.9.mlp.drop: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.norm1: training=True 2022-03-16 04:44:28,264.264 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.qkv: training=True 2022-03-16 
04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.attn_drop: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.proj: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.attn.proj_drop: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.drop_path: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.norm2: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.fc1: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.act: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.fc2: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.10.mlp.drop: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11: training=True 2022-03-16 04:44:28,265.265 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.norm1: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.qkv: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.attn_drop: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.proj: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.attn.proj_drop: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.drop_path: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.norm2: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.fc1: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.act: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.fc2: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.blocks.11.mlp.drop: training=True 2022-03-16 04:44:28,266.266 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.norm1: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 
do_train(): module.module.bert.encoder.tag_blocks.0.attn.qkv: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn.attn_drop: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn.proj: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.attn.proj_drop: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.drop_path: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.norm2: training=True 2022-03-16 04:44:28,267.267 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.fc1: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.act: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.fc2: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.0.mlp.drop: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.norm1: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.qkv: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.attn_drop: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.proj: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.attn.proj_drop: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.drop_path: training=True 2022-03-16 04:44:28,268.268 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.norm2: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.fc1: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.act: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.fc2: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.1.mlp.drop: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.norm1: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn: training=True 2022-03-16 
04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.qkv: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.attn_drop: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.proj: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.attn.proj_drop: training=True 2022-03-16 04:44:28,269.269 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.drop_path: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.norm2: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.fc1: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.act: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.fc2: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.2.mlp.drop: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.norm1: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.qkv: training=True 2022-03-16 04:44:28,270.270 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.attn_drop: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.proj: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.attn.proj_drop: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.drop_path: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.norm2: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.fc1: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.act: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.fc2: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.encoder.tag_blocks.3.mlp.drop: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.caption_pooler: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.caption_pooler.dense: training=True 2022-03-16 04:44:28,271.271 2829:uni_pipeline.py:744 do_train(): module.module.bert.caption_pooler.activation: 
training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.pooler: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.pooler.dense: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.pooler.activation: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.transform: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.transform.dense: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.transform.LayerNorm: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.tag_logit.predictions.decoder: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0: training=True 2022-03-16 04:44:28,272.272 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.query: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.key: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.value: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.self.dropout: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output.dense: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.attention.output.dropout: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.intermediate: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.intermediate.dense: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.output: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.output.dense: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.0.output.LayerNorm: training=True 2022-03-16 04:44:28,273.273 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.decoder.layer.0.output.dropout: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.query: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.key: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.value: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.self.dropout: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output.dense: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,274.274 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.attention.output.dropout: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.intermediate: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.intermediate.dense: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output.dense: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output.LayerNorm: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.1.output.dropout: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention: training=True 2022-03-16 04:44:28,275.275 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.query: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.key: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.value: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.self.dropout: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.output: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.output.dense: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): 
module.module.bert.decoder.layer.2.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.attention.output.dropout: training=True 2022-03-16 04:44:28,276.276 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.intermediate: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.intermediate.dense: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output.dense: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output.LayerNorm: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.2.output.dropout: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.query: training=True 2022-03-16 04:44:28,277.277 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.key: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.value: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.self.dropout: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output.dense: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output.LayerNorm: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.attention.output.dropout: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.intermediate: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.intermediate.dense: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output.dense: training=True 2022-03-16 04:44:28,278.278 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output.LayerNorm: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.bert.decoder.layer.3.output.dropout: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.bert.img_embedding: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.bert.dropout: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls: training=True 
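The "training=True" records above and below come from walking every submodule of the DistributedDataParallel-wrapped model and printing its train/eval flag. A minimal sketch of that audit, assuming plain Python logging; the helper name log_training_flags is hypothetical, not uni_pipeline.py's actual API:

    import logging

    from torch import nn

    logging.basicConfig(level=logging.INFO, format='%(message)s')

    def log_training_flags(model: nn.Module) -> None:
        # Hypothetical helper: mirrors the "<name>: training=<flag>" records
        # in this log by enumerating every named submodule.
        for name, module in model.named_modules():
            logging.info('%s: training=%s', name or 'module', module.training)

    if __name__ == '__main__':
        model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
        model.train()  # .train() flips every submodule, so all lines read training=True
        log_training_flags(model)
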
2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.transform: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.transform.dense: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.transform.LayerNorm: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.cls.predictions.decoder: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.loss: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.loss.log_soft: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.loss.kl: training=True 2022-03-16 04:44:28,279.279 2829:uni_pipeline.py:744 do_train(): module.module.tag_loss: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.patch_embed: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.patch_embed.proj: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.pos_drop: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.blocks: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.norm: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.pre_logits: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:744 do_train(): module.image_encoder.module.head: training=True 2022-03-16 04:44:28,280.280 2829:uni_pipeline.py:745 do_train(): dataset = DatasetPlusTransform(dataset=CaptionIdxTSVDataset(data=TaxCocoCaption, split=train, caption_version=None), transform=Compose( Compose( ImageTransform2Dict(image_transform=Compose( ToPILImage() RandomResizedCrop(size=(384, 384), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR) ColorJitter(brightness=[0.6, 1.4], contrast=[0.6, 1.4], saturation=[0.6, 1.4], hue=None) RandomHorizontalFlip(p=0.5) ToTensor() Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) )) ) LoadCaption(tsv=TSVSplitProperty(tsv=CompositeTSVFile(list_file=data/TaxCocoCaption/train.shuffle.txt, seq_file=data/TaxCocoCaption/trainX.caption.tsv))) LoadLabel(data=TaxCocoCaption, split=train, version=vinvl) TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True) )) 2022-03-16 04:44:28,281.281 2829:trainer.py:367 do_train_dict(): Start training 2022-03-16 04:44:28,283.283 2829:qd_common.py:3452 print_frame_info(): func name = do_train_dict; model = DistributedDataParallel( (module): ImageCaptioning( (module): TaggerEncDecSplitForImageCaptioning( (bert): TaggerEncDecCLSEmbSplitBertImgModel( (embeddings): BertEmbeddings( (word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (extra_embeddings): BertEmbeddings( 
(word_embeddings): Embedding(30522, 768, padding_idx=0) (position_embeddings): Embedding(512, 768) (token_type_embeddings): Embedding(2, 768) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) (encoder): TIMMVitSplitEncoder( (blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (4): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (5): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): 
LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (6): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (7): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (8): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (9): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (10): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (11): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( 
(fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) (tag_blocks): ModuleList( (0): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (1): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (2): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) (3): Block( (norm1): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (attn): Attention( (qkv): Linear(in_features=768, out_features=2304, bias=True) (attn_drop): Dropout(p=0.0, inplace=False) (proj): Linear(in_features=768, out_features=768, bias=True) (proj_drop): Dropout(p=0.0, inplace=False) ) (drop_path): Identity() (norm2): LayerNorm((768,), eps=1e-06, elementwise_affine=True) (mlp): Mlp( (fc1): Linear(in_features=768, out_features=3072, bias=True) (act): GELU() (fc2): Linear(in_features=3072, out_features=768, bias=True) (drop): Dropout(p=0.0, inplace=False) ) ) ) ) (caption_pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (pooler): BertPooler( (dense): Linear(in_features=768, out_features=768, bias=True) (activation): Tanh() ) (tag_logit): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (decoder): BertEncoder( (layer): ModuleList( (0): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, 
elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (1): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (2): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (3): BertLayer( (attention): BertAttention( (self): BertSelfAttention( (query): Linear(in_features=768, out_features=768, bias=True) (key): Linear(in_features=768, out_features=768, bias=True) (value): Linear(in_features=768, out_features=768, bias=True) (dropout): Dropout(p=0.1, inplace=False) ) (output): BertSelfOutput( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) (intermediate): BertIntermediate( (dense): Linear(in_features=768, out_features=3072, bias=True) ) (output): BertOutput( (dense): Linear(in_features=3072, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) (dropout): Dropout(p=0, inplace=False) ) ) ) ) (img_embedding): Identity() (dropout): Identity() ) (cls): BertCaptioningHeads( (predictions): BertLMPredictionHead( (transform): BertPredictionHeadTransform( (dense): Linear(in_features=768, out_features=768, bias=True) (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) ) (decoder): Linear(in_features=768, out_features=30522, bias=False) ) ) (loss): BertCaptioningLoss( (log_soft): LogSoftmax(dim=1) (kl): KLDivLoss() ) (tag_loss): FocalLossWithLogitsNegLoss(alpha=0.5, gamma=1) ) (image_encoder): InputAsDict( (module): VisionTransformer( (patch_embed): PatchEmbed( (proj): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16)) ) (pos_drop): Dropout(p=0.0, inplace=False) (blocks): ModuleList() (norm): Identity() (pre_logits): Identity() (head): Linear(in_features=768, out_features=1000, 
bias=True) ) ) ) ); data_loader = ; optimizer = AdamW ( Parameter Group 0 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['module.cls_token', 'module.pos_embed', 'module.patch_embed.proj.weight', 'module.head.weight'] weight_decay: 0.05 Parameter Group 1 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['module.patch_embed.proj.bias', 'module.head.bias'] weight_decay: 0.0 Parameter Group 2 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['word_embeddings.weight', 'position_embeddings.weight', 'token_type_embeddings.weight'] weight_decay: 0.05 Parameter Group 3 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['LayerNorm.weight', 'LayerNorm.bias'] weight_decay: 0 Parameter Group 4 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight', '4.norm1.weight', '4.attn.qkv.weight', '4.attn.proj.weight', '4.norm2.weight', '4.mlp.fc1.weight', '4.mlp.fc2.weight', '5.norm1.weight', '5.attn.qkv.weight', '5.attn.proj.weight', '5.norm2.weight', '5.mlp.fc1.weight', '5.mlp.fc2.weight', '6.norm1.weight', '6.attn.qkv.weight', '6.attn.proj.weight', '6.norm2.weight', '6.mlp.fc1.weight', '6.mlp.fc2.weight', '7.norm1.weight', '7.attn.qkv.weight', '7.attn.proj.weight', '7.norm2.weight', '7.mlp.fc1.weight', '7.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 5 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias', '4.norm1.bias', '4.attn.qkv.bias', '4.attn.proj.bias', '4.norm2.bias', '4.mlp.fc1.bias', '4.mlp.fc2.bias', '5.norm1.bias', '5.attn.qkv.bias', '5.attn.proj.bias', '5.norm2.bias', '5.mlp.fc1.bias', '5.mlp.fc2.bias', '6.norm1.bias', '6.attn.qkv.bias', '6.attn.proj.bias', '6.norm2.bias', '6.mlp.fc1.bias', '6.mlp.fc2.bias', '7.norm1.bias', '7.attn.qkv.bias', '7.attn.proj.bias', '7.norm2.bias', '7.mlp.fc1.bias', '7.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 6 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 7 betas: (0.9, 
0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 8 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.weight', '0.attn.qkv.weight', '0.attn.proj.weight', '0.norm2.weight', '0.mlp.fc1.weight', '0.mlp.fc2.weight', '1.norm1.weight', '1.attn.qkv.weight', '1.attn.proj.weight', '1.norm2.weight', '1.mlp.fc1.weight', '1.mlp.fc2.weight', '2.norm1.weight', '2.attn.qkv.weight', '2.attn.proj.weight', '2.norm2.weight', '2.mlp.fc1.weight', '2.mlp.fc2.weight', '3.norm1.weight', '3.attn.qkv.weight', '3.attn.proj.weight', '3.norm2.weight', '3.mlp.fc1.weight', '3.mlp.fc2.weight'] weight_decay: 0.05 Parameter Group 9 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['0.norm1.bias', '0.attn.qkv.bias', '0.attn.proj.bias', '0.norm2.bias', '0.mlp.fc1.bias', '0.mlp.fc2.bias', '1.norm1.bias', '1.attn.qkv.bias', '1.attn.proj.bias', '1.norm2.bias', '1.mlp.fc1.bias', '1.mlp.fc2.bias', '2.norm1.bias', '2.attn.qkv.bias', '2.attn.proj.bias', '2.norm2.bias', '2.mlp.fc1.bias', '2.mlp.fc2.bias', '3.norm1.bias', '3.attn.qkv.bias', '3.attn.proj.bias', '3.norm2.bias', '3.mlp.fc1.bias', '3.mlp.fc2.bias'] weight_decay: 0.0 Parameter Group 10 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 11 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 12 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['dense.weight'] weight_decay: 0.05 Parameter Group 13 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['dense.bias'] weight_decay: 0.0 Parameter Group 14 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['predictions.bias', 'predictions.transform.dense.bias', 'predictions.transform.LayerNorm.weight', 'predictions.transform.LayerNorm.bias'] weight_decay: 0.0 Parameter Group 15 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 1e-05 lr: 1e-05 param_names: ['predictions.transform.dense.weight', 'predictions.decoder.weight'] weight_decay: 0.05 Parameter Group 16 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['layer.0.attention.self.query.weight', 'layer.0.attention.self.key.weight', 'layer.0.attention.self.value.weight', 'layer.0.attention.output.dense.weight', 'layer.0.intermediate.dense.weight', 'layer.0.output.dense.weight', 'layer.1.attention.self.query.weight', 'layer.1.attention.self.key.weight', 'layer.1.attention.self.value.weight', 'layer.1.attention.output.dense.weight', 'layer.1.intermediate.dense.weight', 'layer.1.output.dense.weight', 'layer.2.attention.self.query.weight', 'layer.2.attention.self.key.weight', 'layer.2.attention.self.value.weight', 'layer.2.attention.output.dense.weight', 'layer.2.intermediate.dense.weight', 'layer.2.output.dense.weight', 
'layer.3.attention.self.query.weight', 'layer.3.attention.self.key.weight', 'layer.3.attention.self.value.weight', 'layer.3.attention.output.dense.weight', 'layer.3.intermediate.dense.weight', 'layer.3.output.dense.weight'] weight_decay: 0.05 Parameter Group 17 betas: (0.9, 0.999) correct_bias: True eps: 1e-08 initial_lr: 0.0001 lr: 0.0001 param_names: ['layer.0.attention.self.query.bias', 'layer.0.attention.self.key.bias', 'layer.0.attention.self.value.bias', 'layer.0.attention.output.dense.bias', 'layer.0.attention.output.LayerNorm.weight', 'layer.0.attention.output.LayerNorm.bias', 'layer.0.intermediate.dense.bias', 'layer.0.output.dense.bias', 'layer.0.output.LayerNorm.weight', 'layer.0.output.LayerNorm.bias', 'layer.1.attention.self.query.bias', 'layer.1.attention.self.key.bias', 'layer.1.attention.self.value.bias', 'layer.1.attention.output.dense.bias', 'layer.1.attention.output.LayerNorm.weight', 'layer.1.attention.output.LayerNorm.bias', 'layer.1.intermediate.dense.bias', 'layer.1.output.dense.bias', 'layer.1.output.LayerNorm.weight', 'layer.1.output.LayerNorm.bias', 'layer.2.attention.self.query.bias', 'layer.2.attention.self.key.bias', 'layer.2.attention.self.value.bias', 'layer.2.attention.output.dense.bias', 'layer.2.attention.output.LayerNorm.weight', 'layer.2.attention.output.LayerNorm.bias', 'layer.2.intermediate.dense.bias', 'layer.2.output.dense.bias', 'layer.2.output.LayerNorm.weight', 'layer.2.output.LayerNorm.bias', 'layer.3.attention.self.query.bias', 'layer.3.attention.self.key.bias', 'layer.3.attention.self.value.bias', 'layer.3.attention.output.dense.bias', 'layer.3.attention.output.LayerNorm.weight', 'layer.3.attention.output.LayerNorm.bias', 'layer.3.intermediate.dense.bias', 'layer.3.output.dense.bias', 'layer.3.output.LayerNorm.weight', 'layer.3.output.LayerNorm.bias'] weight_decay: 0.0 ); scheduler = ; checkpointer = ; device = cuda; checkpoint_period = 5000; arguments = {'iteration': 0}; log_step = 100; data_partition = 1; explicit_average_grad = False; no_update = False; ema = None; use_amp = False; gradient_clip = 1.0; model_sub_name_fn = /opt/conda/lib/python3.7/site-packages/torch/optim/lr_scheduler.py:216: UserWarning: Please also save or load the state of the optimizer when saving or loading the scheduler. 
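The eighteen parameter groups printed above (0 through 17) encode a conventional split: weight matrices carry weight_decay 0.05 while their paired biases and LayerNorm parameters carry 0.0, and the groups holding already-pretrained transformer blocks run at lr 1e-5 while the rest run at the base lr 1e-4. A minimal sketch of assembling such groups follows; the name predicates are illustrative assumptions (which submodules get the lower rate is project-specific — note, for instance, that the ViT normX.weight entries above still receive decay), and the printed optimizer is a transformers-style AdamW with correct_bias, whereas the sketch uses torch.optim.AdamW.

```python
import torch

def build_adamw(model, base_lr=1e-4, low_lr=1e-5, wd=0.05):
    # Assumed predicates -- only the bias/LayerNorm rule is clearly
    # visible in the groups printed above; adjust to the actual model.
    def no_decay(name):
        return name.endswith('.bias') or 'LayerNorm' in name
    def low_rate(name):
        return 'blocks.' in name  # e.g. pretrained encoder blocks (assumption)

    groups = {}
    for name, p in model.named_parameters():
        if not p.requires_grad:
            continue
        lr = low_lr if low_rate(name) else base_lr
        decay = 0.0 if no_decay(name) else wd
        g = groups.setdefault((lr, decay), {
            'params': [], 'param_names': [], 'lr': lr, 'weight_decay': decay})
        g['params'].append(p)
        g['param_names'].append(name)  # extra keys are kept in param_groups
    return torch.optim.AdamW(list(groups.values()), betas=(0.9, 0.999), eps=1e-8)
```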
warnings.warn(SAVE_STATE_WARNING, UserWarning) 2022-03-16 04:44:28,287.287 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0000000.pt 2022-03-16 04:45:52,906.906 3427:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=44475 2022-03-16 04:45:52,906.906 3424:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=7048 2022-03-16 04:45:52,906.906 3425:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=12705 2022-03-16 04:45:52,907.907 3430:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=91679 2022-03-16 04:45:52,907.907 3426:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=25979 2022-03-16 04:45:52,907.907 3429:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=94718 2022-03-16 04:45:52,908.908 3428:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=25521 2022-03-16 04:45:52,908.908 3431:tsv_io.py:129 __getitem__(): too long to load fname = TSVFile(tsv_file='data/coco_caption/train.img.tsv'), source=0, row=47118 /opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:1639: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead. warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.") 2022-03-16 04:46:01,792.792 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.0 2022-03-16 04:46:01,792.792 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 212.95741271972656 2022-03-16 04:46:01,792.792 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
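Two of the records above belong together: the lr_scheduler.py warning asks for the optimizer state to be saved alongside the scheduler, and checkpoint.py then writes snapshot/model_iter_0000000.pt. The repo's checkpointer is not shown in the log; a minimal sketch that satisfies the warning by bundling model, optimizer, scheduler, and the iteration counter (matching arguments = {'iteration': 0}) into one file:

```python
import torch

def save_checkpoint(path, model, optimizer, scheduler, iteration):
    # Saving optimizer *and* scheduler state addresses the SAVE_STATE_WARNING
    # above and lets a resumed run continue the LR schedule exactly.
    torch.save({'model': model.state_dict(),
                'optimizer': optimizer.state_dict(),
                'scheduler': scheduler.state_dict(),
                'iteration': iteration}, path)

def load_checkpoint(path, model, optimizer, scheduler):
    ckpt = torch.load(path, map_location='cpu')
    model.load_state_dict(ckpt['model'])
    optimizer.load_state_dict(ckpt['optimizer'])
    scheduler.load_state_dict(ckpt['scheduler'])
    return ckpt.get('iteration', 0)
```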
= 70.28994750976562 2022-03-16 04:46:04,247.247 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.004969404079020023 2022-03-16 04:46:04,247.247 2829:tagger_caption_uni_pipeline_expanding.py:416 forward(): # of tokens = 577 2022-03-16 04:46:04,248.248 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:46:04,249.249 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'ferry', 'boat', 'in', 'a', 'large', 'canal', 'and', 'large', 'buildings', '[MASK]', 'sobs', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:46:04,267.267 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'boat', 'sky', 'water', 'city', 'bridge', 'window', 'tower', 'river', 'skyscraper', 'flag', 'wake', 'person', 'cloud', 'wave', 'wall', 'crane', 'tree', 'ship', '[UNK]', 'train', 'pillar', 'dock', 'sign', 'arch', 'antenna', 'roof', 'cabin', 'top', 'harbor', 'structure', 'bus', 'bottom', 'pole', 'dome', 'ripple', 'reflection', 'spire', 'rope', 'car', 'shore', 'walkway', 'column', 'man', 'railing', 'name', 'ramp', 'door', 'base', 'mast'] 2022-03-16 04:46:20,520.520 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'water', 'building', 'large', 'river', 'person', 'bridge', 'window', 'sky', 'bus', 'boat', 'canal', 'shore', 'ferry', 'crane', 'skyscraper'] /tmp/code/src/qd/mask/solver/optimization.py:186: UserWarning: This overload of add_ is deprecated: add_(Number alpha, Tensor other) Consider using one of the following signatures instead: add_(Tensor other, *, Number alpha) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:882.) exp_avg.mul_(beta1).add_(1.0 - beta1, grad) 2022-03-16 04:48:55,989.989 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:22:05 iter: 100 speed: 191.3 images/sec total_norm: 181.1051 (196.3794) loss: 212.3282 (214.7145) masked_loss: 4.9153 (5.3315) tag_loss: 207.9742 (209.3830) time: 1.4309 (1.4314) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4260 (1.4265) lr: 0.000100 max mem: 26307 2022-03-16 04:48:56,351.351 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.1428571492433548 2022-03-16 04:48:56,351.351 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 198.2447967529297 2022-03-16 04:48:56,351.351 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
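The UserWarning above pinpoints src/qd/mask/solver/optimization.py:186, where the Adam update still uses the deprecated positional add_ overload. The keyword form the warning recommends is numerically identical:

```python
import torch

beta1, grad = 0.9, torch.randn(8)
exp_avg = torch.zeros(8)

# Deprecated overload flagged at optimization.py:186:
#     exp_avg.mul_(beta1).add_(1.0 - beta1, grad)
# Supported keyword form:
exp_avg.mul_(beta1).add_(grad, alpha=1.0 - beta1)
```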
= 69.85773849487305 2022-03-16 04:48:58,816.816 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0074226646684110165 2022-03-16 04:48:58,816.816 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:48:58,816.816 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'todd', '##ler', 'wearing', 'a', 'large', 'floppy', 'hat', 'playing', 'in', 'the', 'sand', '[MASK]', '[MASK]', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:48:58,832.832 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sand', 'sky', 'foot', 'shirt', 'hand', 'handle', 'girl', 'child', 'blanket', 'hat', 'bucket', 'beach', 'leg', '[UNK]', 'person', 'baby', 'hair', 'toy', 'head', 'short', 'woman', 'boy', 'top', 'spoon', 'towel', 'flower', 'face', 'ground', 'cup', 'water', 'ocean', 'shovel', 'brush', 'dress', 'cloth', 'lid', 'bag', 'hole', 'arm', 'man', 'bed', 'container', 'food', 'chair', 'ball', 'fork', 'family', 'strap', 'toe', 'knee'] 2022-03-16 04:49:14,775.775 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'large', 'woman', 'short', 'hair', 'girl', 'child', 'foot', 'baby', 'beach', 'sky', 'shirt', 'leg', 'object', 'handle', 'sand', 'hat', 'flower', 'blanket', 'toy', 'fork', 'towel', 'bucket', 'spoon', 'bikini', 'floppy'] 2022-03-16 04:51:38,600.600 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:10:26 iter: 200 speed: 314.9 images/sec total_norm: 142.2078 (147.6038) loss: 190.7610 (191.1786) masked_loss: 4.1353 (4.1499) tag_loss: 186.9331 (187.0288) time: 1.4345 (1.6261) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4293 (1.6210) lr: 0.000100 max mem: 26307 2022-03-16 04:51:38,960.960 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.37142857909202576 2022-03-16 04:51:38,960.960 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 187.27215576171875 2022-03-16 04:51:38,960.960 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
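Each trainer.py do_train_dict() record reports metrics as value (running value), e.g. loss: 190.7610 (191.1786) at iteration 200. The exact semantics are not logged; in maskrcnn-benchmark-style trainers, which this format resembles, the first number is the median over a recent window and the parenthesized one the global average — a sketch under that assumption:

```python
from collections import deque

class SmoothedValue:
    """Track a series; print 'windowed median (global average)'."""
    def __init__(self, window=20):
        self.window = deque(maxlen=window)
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.window.append(value)
        self.total += value
        self.count += 1

    def __str__(self):
        ordered = sorted(self.window)
        median = ordered[len(ordered) // 2]
        return f'{median:.4f} ({self.total / max(self.count, 1):.4f})'
```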
= 70.15105438232422 2022-03-16 04:51:41,492.492 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.008562282659113407 2022-03-16 04:51:41,492.492 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:51:41,493.493 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', 'steam', 'locomotive', 'approaching', 'and', 'about', 'to', 'do', 'through', 'an', 'under', '##pass', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:51:41,508.508 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'track', 'engine', 'bush', 'number', 'sky', 'tree', 'gravel', 'car', 'hill', 'ground', 'smoke', 'front', 'mountain', 'steam', 'railroad', '[UNK]', 'wheel', 'building', 'roof', 'window', 'grass', 'pole', 'plant', 'chimney', 'bumper', 'person', 'house', 'photo', 'hillside', 'sign', 'light', 'platform', 'wall', 'man', 'conductor', 'bridge', 'locomotive', 'sidewalk', 'background', 'fence', 'rock', 'post', 'flower', 'snow', 'bell', 'box', 'door', 'station', 'cloud'] 2022-03-16 04:51:57,499.499 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'number', 'old', 'building', 'car', 'ground', 'rock', 'track', 'wall', 'hill', 'mountain', 'engine', 'train', 'tree', 'sign', 'sky', 'shadow', 'wheel', 'steam', 'smoke', 'bush', 'blind', 'locomotive', 'approaching', 'gravel', 'hillside'] 2022-03-16 04:54:21,136.136 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:42:56 iter: 300 speed: 315.0 images/sec total_norm: 137.6106 (141.0283) loss: 183.0483 (182.1217) masked_loss: 3.8971 (3.9055) tag_loss: 179.2304 (178.2163) time: 1.4334 (1.6253) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6202) lr: 0.000100 max mem: 26307 2022-03-16 04:54:21,499.499 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3142857253551483 2022-03-16 04:54:21,500.500 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 183.96987915039062 2022-03-16 04:54:21,500.500 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.22936820983887 2022-03-16 04:54:24,058.058 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.009669368155300617 2022-03-16 04:54:24,059.059 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:54:24,059.059 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'playing', 'baseball', '[MASK]', '[MASK]', 'swing', 'his', 'bat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:54:24,074.074 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'fence', 'line', 'man', '[UNK]', 'person', 'grass', 'shoe', 'game', 'catcher', 'field', 'bat', 'dirt', 'baseball', 'short', 'boy', 'umpire', 'woman', 'hand', 'ground', 'glove', 'pole', 'mask', 'hat', 'batter', 'plate', 'sign', 'ball', 'player', 'uniform', 'tree', 'leg', 'cap', 'shadow', 'sky', 'jean', 'jersey', 'head', 'girl', 'belt', 'camera', 'hair', 'jacket', 'pad', 'bag', 'sunglasses', 'sock', 'bench', 'child'] 2022-03-16 04:54:40,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'game', 'little', 'line', 'player', 'woman', 'short', 'field', 'person', 'boy', 'baseball', 'sign', 'shirt', 'grass', 'hat', 'cap', 'uniform', 'pole', 'dirt', 'bat', 'mask', 'fence', 'helmet', 'shoe', 'catcher', 'glove', 'umpire'] 2022-03-16 04:57:03,554.554 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:57:15 iter: 400 speed: 315.2 images/sec total_norm: 135.6639 (141.4707) loss: 183.5033 (182.2539) masked_loss: 3.6852 (3.6802) tag_loss: 179.7325 (178.5737) time: 1.4345 (1.6242) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4295 (1.6191) lr: 0.000099 max mem: 26307 2022-03-16 04:57:03,915.915 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.15625 2022-03-16 04:57:03,915.915 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.03848266601562 2022-03-16 04:57:03,916.916 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.91149139404297 2022-03-16 04:57:06,492.492 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.010557741858065128 2022-03-16 04:57:06,492.492 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:57:06,492.492 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'old', '[MASK]', 'go', 'for', 'drinks', 'at', 'a', 'pub', '##bery', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:57:06,508.508 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hat', 'glass', 'woman', 'person', 'table', 'man', 'hair', 'wall', 'shirt', 'light', 'head', 'cap', 'jacket', 'ceiling', 'glasses', 'wine', 'archway', 'cup', '[UNK]', 'hand', 'face', 'bottle', 'bowl', 'group', 'arch', 'plate', 'window', 'sweater', 'building', 'picture', 'chair', 'food', 'pitcher', 'room', 'jean', 'candle', 'bar', 'coat', 'lamp', 'napkin', 'ear', 'purse', 'sign', 'suit', 'vase', 'lady', 'basket', 'paper', 'door', 'water'] 2022-03-16 04:57:22,568.568 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'old', 'room', 'building', 'light', 'woman', 'hair', 'person', 'table', 'wall', 'glass', 'paper', 'sign', 'jean', 'shirt', 'picture', 'wine', 'speaker', 'ceiling', 'hat', 'cap', 'jacket', 'pen', 'glasses', 'pitcher', 'pub', 'lid', 'vase', 'napkin', 'archway'] 2022-03-16 04:59:45,987.987 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:04:43 iter: 500 speed: 315.2 images/sec total_norm: 133.8038 (137.0907) loss: 180.4330 (180.3936) masked_loss: 3.6690 (3.7253) tag_loss: 176.5965 (176.6683) time: 1.4343 (1.6244) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4290 (1.6192) lr: 0.000099 max mem: 26307 2022-03-16 04:59:46,348.348 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42105263471603394 2022-03-16 04:59:46,349.349 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.7808074951172 2022-03-16 04:59:46,349.349 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
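The Input ids samples above show BERT-style input corruption at work: most selected positions are replaced by [MASK], while a few are swapped for random vocabulary tokens — which is why stray words such as 'sobs' (iteration 0) and, in later samples, 'musique' and '相' appear inside otherwise ordinary captions. A sketch under the standard 80/10/10 recipe; the pipeline's actual selection rate and ratios are not logged:

```python
import random

SPECIAL = {'[CLS]', '[SEP]', '[PAD]'}

def corrupt(tokens, vocab, mask_prob=0.15):
    # 80% -> [MASK], 10% -> random token, 10% -> keep (standard BERT recipe)
    out = list(tokens)
    for i, tok in enumerate(tokens):
        if tok in SPECIAL or random.random() > mask_prob:
            continue
        r = random.random()
        if r < 0.8:
            out[i] = '[MASK]'
        elif r < 0.9:
            out[i] = random.choice(vocab)   # e.g. 'sobs', 'musique'
    return out
```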
= 68.73734283447266 2022-03-16 04:59:48,973.973 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.010883732698857784 2022-03-16 04:59:48,974.974 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 04:59:48,974.974 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'outside', 'seen', 'of', 'many', 'colorful', 'open', '[MASK]', '##s', 'with', 'a', 'huge', 'multi', 'colored', 'umbrella', '[MASK]', 'in', 'front', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 04:59:48,989.989 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', 'person', 'woman', 'man', 'building', 'shirt', 'crowd', 'tree', 'hat', 'hair', '[UNK]', 'head', 'beach', 'sunglasses', 'pole', 'bag', 'sky', 'chair', 'stripe', 'short', 'background', 'top', 'purse', 'roof', 'tent', 'arm', 'market', 'towel', 'girl', 'hand', 'wall', 'sign', 'dress', 'water', 'house', 'jacket', 'flag', 'cap', 'grass', 'large', 'face', 'fence', 'window', 'child', 'table', 'next', 'lady', 'glasses', 'backpack', 'couple'] 2022-03-16 05:00:04,956.956 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['many', 'head', 'man', 'building', 'open', 'front', 'woman', 'hair', 'person', 'boy', 'beach', 'shirt', 'huge', 'crowd', 'multi', 'hat', 'tent', 'umbrella', 'colorful'] 2022-03-16 05:02:28,541.541 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:08:58 iter: 600 speed: 315.0 images/sec total_norm: 131.7876 (134.7105) loss: 176.1494 (175.0405) masked_loss: 3.4661 (3.4510) tag_loss: 172.4846 (171.5895) time: 1.4347 (1.6255) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4293 (1.6202) lr: 0.000099 max mem: 26307 2022-03-16 05:02:28,901.901 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 05:02:28,902.902 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.16656494140625 2022-03-16 05:02:28,902.902 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.56050981794085 2022-03-16 05:02:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.011422410607337952 2022-03-16 05:02:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:02:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'batter', 'musique', 'catcher', 'and', '[MASK]', 'in', 'a', 'baseball', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:02:31,565.565 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', '[UNK]', 'bat', 'helmet', 'shoe', 'field', 'person', 'player', 'baseball', 'roof', 'grass', 'game', 'sky', 'batter', 'glove', 'fence', 'dirt', 'catcher', 'uniform', 'building', 'hat', 'belt', 'leg', 'line', 'hand', 'pole', 'jacket', 'plate', 'tree', 'mask', 'stadium', 'ball', 'ground', 'umpire', 'home', 'bench', 'jersey', 'shadow', 'head', 'sock', 'camera', 'short', 'cap', 'woman', 'window', 'arm', 'sign', 'crowd', 'wall'] 2022-03-16 05:02:47,498.498 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'game', 'building', 'player', 'field', 'ground', 'person', 'stand', 'baseball', 'shirt', 'jersey', 'leg', 'roof', 'plate', 'grass', 'belt', 'hat', 'cap', 'uniform', 'jacket', 'dirt', 'bat', 'mask', 'helmet', 'shoe', 'catcher', 'glove', 'umpire', 'spectator', 'batter'] 2022-03-16 05:05:11,492.492 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:11:51 iter: 700 speed: 314.2 images/sec total_norm: 133.7510 (140.6465) loss: 176.2631 (177.3342) masked_loss: 3.2643 (3.2633) tag_loss: 172.8288 (174.0709) time: 1.4354 (1.6295) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4305 (1.6245) lr: 0.000099 max mem: 26307 2022-03-16 05:05:11,853.853 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174 2022-03-16 05:05:11,853.853 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.90536499023438 2022-03-16 05:05:11,853.853 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.29880237579346 2022-03-16 05:05:14,540.540 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01174083910882473 2022-03-16 05:05:14,541.541 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:05:14,541.541 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'children', '[MASK]', 'around', 'a', 'table', 'with', 'a', 'large', 'cake', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:05:14,556.556 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'shirt', 'hair', 'cake', 'plate', 'man', '[UNK]', 'child', 'glass', 'woman', 'boy', 'person', 'head', 'girl', 'hat', 'hand', 'bowl', 'wall', 'window', 'sweater', 'cup', 'fork', 'knife', 'food', 'flower', 'beard', 'tray', 'chair', 'napkin', 'group', 'face', 'container', 'candle', 'baby', 'wine', 'picture', 'necklace', 'lamp', 'spoon', 'cap', 'family', 'glasses', 'kid', 'jacket', 'bag', 'lid', 'dish', 'ear', 'dessert', 'jean'] 2022-03-16 05:05:30,610.610 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'large', 'door', 'woman', 'cup', 'hair', 'girl', 'person', 'child', 'table', 'wall', 'food', 'boy', 'shirt', 'handle', 'plate', 'knife', 'hat', 'cap', 'flower', 'glasses', 'fork', 'cake', 'beard', 'lamp', 'spoon'] 2022-03-16 05:07:54,307.307 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:13:08 iter: 800 speed: 314.5 images/sec total_norm: 134.2146 (138.4319) loss: 170.5890 (174.6861) masked_loss: 3.0440 (3.1001) tag_loss: 168.2145 (171.5861) time: 1.4348 (1.6282) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4299 (1.6230) lr: 0.000099 max mem: 26307 2022-03-16 05:07:54,667.667 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 05:07:54,667.667 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.43075561523438 2022-03-16 05:07:54,668.668 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
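The Tag mAP figure creeps up from 0.0050 at iteration 0 to about 0.0117 here, consistent with a per-class average precision over the multi-hot tag targets, averaged across classes that have at least one positive. The evaluation code is not shown in the log; a plain-numpy sketch of that definition:

```python
import numpy as np

def average_precision(scores, labels):
    # scores: (N,) logits for one tag class; labels: (N,) in {0, 1}
    order = np.argsort(-scores)
    labels = labels[order]
    if labels.sum() == 0:
        return 0.0
    cum_pos = np.cumsum(labels)
    precision = cum_pos / (np.arange(len(labels)) + 1)
    return float((precision * labels).sum() / labels.sum())

def tag_map(scores, labels):
    # scores, labels: (num_samples, num_tags)
    aps = [average_precision(scores[:, c], labels[:, c])
           for c in range(scores.shape[1]) if labels[:, c].sum() > 0]
    return sum(aps) / max(len(aps), 1)
```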
= 68.56159040662978 2022-03-16 05:07:57,420.420 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012304706498980522 2022-03-16 05:07:57,420.420 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:07:57,420.420 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'is', 'flying', 'a', 'kite', 'outside', '[MASK]', 'with', 'others', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:07:57,435.435 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'cloud', 'grass', '[UNK]', 'park', 'field', 'person', 'head', 'man', 'building', 'leg', 'pole', 'ground', 'trunk', 'branch', 'bush', 'fence', 'tail', 'jacket', 'shadow', 'shirt', 'road', 'bench', 'post', 'roof', 'shoe', 'sign', 'hair', 'kite', 'house', 'next', 'wheel', 'hand', 'ear', 'car', 'face', 'horse', 'hat', 'street', 'jean', 'rock', 'front', 'light', 'hill', 'dirt', 'wall', 'window', 'woman', 'sidewalk'] 2022-03-16 05:08:13,444.444 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'park', 'hair', 'outside', 'person', 'child', 'arm', 'boy', 'tree', 'wood', 'sky', 'shirt', 'leg', 'shadow', 'grass', 'bush', 'hat', 'cloud', 'jacket', 'sweater', 'kite', 'stump'] 2022-03-16 05:10:37,139.139 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:13:32 iter: 900 speed: 314.4 images/sec total_norm: 137.5071 (138.8992) loss: 174.5231 (173.8569) masked_loss: 3.0852 (3.0507) tag_loss: 172.3848 (170.8062) time: 1.4357 (1.6283) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4305 (1.6232) lr: 0.000099 max mem: 26307 2022-03-16 05:10:37,504.504 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3030303120613098 2022-03-16 05:10:37,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.32888793945312 2022-03-16 05:10:37,505.505 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.19998245239258 2022-03-16 05:10:40,240.240 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012654680758714676 2022-03-16 05:10:40,240.240 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:10:40,240.240 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'surf', '##board', 'are', 'stacked', 'in', 'a', 'shed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:10:40,256.256 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'roof', 'sky', 'tree', '[UNK]', 'sign', 'street', 'person', 'shirt', 'house', 'wall', 'man', 'sidewalk', 'woman', 'road', 'hair', 'door', 'ground', 'bag', 'store', 'wire', 'head', 'motorcycle', 'pole', 'umbrella', 'bike', 'shop', 'wheel', 'shoe', 'plant', 'light', 'tire', 'hand', 'hat', 'jacket', 'bicycle', 'short', 'skirt', 'trash', 'line', 'basket', 'dress', 'car', 'leg', 'cart', 'clothes', 'chair', 'jean', 'boy'] 2022-03-16 05:10:56,196.196 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'face', 'building', 'woman', 'short', 'ground', 'board', 'hair', 'person', 'table', 'arm', 'boy', 'window', 'tree', 'sign', 'sky', 'shirt', 'roof', 'bag', 'flag', 'bottle', 'bin', 'brush', 'shed', 'sidewalk', 'jug', 'crate'] 03-16 05:12:19.019 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 05:12:19.019 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 05:12:20.144 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 05:13:19,843.843 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:13:10 iter: 1000 speed: 314.7 images/sec total_norm: 132.1233 (136.8789) loss: 173.8704 (174.4273) masked_loss: 2.8622 (2.8671) tag_loss: 170.4219 (171.5602) time: 1.4341 (1.6270) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4289 (1.6218) lr: 0.000098 max mem: 26307 2022-03-16 05:13:20,204.204 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 05:13:20,205.205 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.07948303222656 2022-03-16 05:13:20,205.205 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
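Interleaved with training, aml_server.py shells out to nvidia-smi and logs one dict per GPU ({'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, ...), confirming all eight V100s are near-fully utilized. Its implementation is not shown; the same shape can be produced with nvidia-smi's CSV query mode:

```python
import subprocess

def gpu_stats():
    out = subprocess.check_output(
        ['nvidia-smi',
         '--query-gpu=memory.used,memory.total,utilization.gpu',
         '--format=csv,noheader,nounits'],
        text=True)
    return [{'mem_used': int(used), 'mem_total': int(total), 'gpu_util': int(util)}
            for used, total, util in
            (line.split(',') for line in out.strip().splitlines())]
```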
= 68.00378071178089 2022-03-16 05:13:22,982.982 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012677717953920364 2022-03-16 05:13:22,983.983 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:13:22,983.983 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'orange', 'bus', '相', 'street', 'next', 'to', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:13:22,998.998 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'bus', 'building', 'road', 'street', 'light', 'tire', 'pole', '[UNK]', 'windshield', 'tree', 'sign', 'wheel', 'stripe', 'line', 'roof', 'man', 'door', 'mirror', 'plate', 'sidewalk', 'front', 'car', 'logo', 'person', 'curb', 'license', 'bumper', 'driver', 'wall', 'letter', 'vest', 'number', 'traffic', 'fence', 'chimney', 'truck', 'van', 'jacket', 'writing', 'grill', 'cone', 'white', 'hat', 'helmet', 'arrow', 'back', 'city', 'shoe'] 2022-03-16 05:13:39,004.004 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'next', 'white', 'road', 'front', 'street', 'light', 'window', 'tree', 'sign', 'sky', 'bus', 'roof', 'orange', 'wheel', 'mirror', 'pole', 'fence', 'sidewalk', 'tire', 'advertisement', 'stripe', 'vent', 'windshield', 'bumper'] 2022-03-16 05:16:02,734.734 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:12:34 iter: 1100 speed: 314.3 images/sec total_norm: 129.3981 (134.6015) loss: 172.7981 (174.8679) masked_loss: 2.8405 (2.8324) tag_loss: 170.2540 (172.0356) time: 1.4360 (1.6289) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4309 (1.6236) lr: 0.000098 max mem: 26307 2022-03-16 05:16:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.23529411852359772 2022-03-16 05:16:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.63967895507812 2022-03-16 05:16:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.03975868225098 2022-03-16 05:16:05,908.908 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.012720104306936264 2022-03-16 05:16:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:16:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'professional', 'pitcher', 'on', 'the', 'mound', 'getting', '[MASK]', 'to', '[MASK]', 'the', 'ball', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:16:05,924.924 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'belt', 'field', 'jersey', 'baseball', 'uniform', '[UNK]', 'glove', 'shirt', 'shoe', 'number', 'man', 'dirt', 'leg', 'head', 'player', 'hand', 'hat', 'fence', 'ball', 'cap', 'arm', 'ground', 'name', 'wall', 'helmet', 'back', 'face', 'logo', 'line', 'home', 'pole', 'person', 'mound', 'plate', 'base', 'stripe', 'letter', 'sign', 'hair', 'sock', 'tree', 'bat', 'game', 'shadow', 'pitcher', 'sleeve', 'shin', 'pitch', 'ear'] 2022-03-16 05:16:21,907.907 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'man', 'name', 'hand', 'number', 'line', 'player', 'field', 'ground', 'professional', 'person', 'arm', 'ready', 'baseball', 'ball', 'letter', 'shirt', 'jersey', 'leg', 'bag', 'grass', 'belt', 'cap', 'uniform', 'dirt', 'pitcher', 'fence', 'shoe', 'mound', 'cooler', 'glove', 'sunglasses'] 2022-03-16 05:18:45,558.558 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:11:33 iter: 1200 speed: 314.5 images/sec total_norm: 132.7703 (136.9227) loss: 172.2552 (172.7141) masked_loss: 2.6881 (2.7103) tag_loss: 169.2189 (170.0038) time: 1.4346 (1.6283) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4297 (1.6233) lr: 0.000098 max mem: 26307 2022-03-16 05:18:45,924.924 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 05:18:45,924.924 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.54107666015625 2022-03-16 05:18:45,924.924 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.81044123722957 2022-03-16 05:18:48,783.783 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01294653583317995 2022-03-16 05:18:48,783.783 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:18:48,784.784 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'an', 'image', 'of', '[MASK]', '[MASK]', 'in', 'a', 'motorcycle', 'side', 'car', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:18:48,799.799 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'ear', 'head', 'motorcycle', 'car', '[UNK]', 'bike', 'tire', 'man', 'seat', 'window', 'sign', 'eye', 'rack', 'building', 'mirror', 'light', 'wheel', 'wall', 'shirt', 'door', 'person', 'collar', 'pole', 'windshield', 'hand', 'handle', 'nose', 'face', 'truck', 'hair', 'jacket', 'hood', 'mouth', 'bar', 'tag', 'shadow', 'fender', 'sky', 'tree', 'chain', 'paw', 'plate', 'vehicle', 'woman', 'roof', 'jean', 'ground', 'leg', 'reflection'] 2022-03-16 05:19:04,876.876 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'side', 'building', 'car', 'mouth', 'wall', 'seat', 'writing', 'eye', 'window', 'letter', 'sign', 'image', 'gas', 'dog', 'nose', 'ear', 'tank', 'handle', 'mirror', 'pole', 'hood', 'bike', 'logo', 'pipe', 'motorcycle', 'tire', 'pillar', 'exhaust', 'fender', 'windshield'] 2022-03-16 05:21:28,633.633 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:10:28 iter: 1300 speed: 314.0 images/sec total_norm: 133.8144 (138.2303) loss: 173.9276 (173.6177) masked_loss: 2.7259 (2.7298) tag_loss: 170.3922 (170.8880) time: 1.4365 (1.6306) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4315 (1.6256) lr: 0.000098 max mem: 26307 2022-03-16 05:21:28,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4166666567325592 2022-03-16 05:21:28,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.44674682617188 2022-03-16 05:21:28,994.994 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
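The caption acc values move in coarse fractions (0.15625 = 5/32, 0.4166… = 5/12, 0.142857 ≈ 1/7), which is what one expects if accuracy is measured only on the handful of masked positions in one logged sample. That interpretation is an assumption — the pipeline code is not shown — but under it the computation is just:

```python
import torch

def masked_token_accuracy(logits, target_ids, is_masked):
    # logits: (T, V); target_ids: (T,); is_masked: (T,) bool over positions
    pred = logits.argmax(dim=-1)
    n = int(is_masked.sum())
    if n == 0:
        return 0.0
    return float((pred[is_masked] == target_ids[is_masked]).sum()) / n
```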
= 67.68502698625836 2022-03-16 05:21:31,923.923 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013217877596616745 2022-03-16 05:21:31,924.924 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:21:31,924.924 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lady', 'bent', '[MASK]', 'with', '[MASK]', 'tennis', 'rack', '##et', 'while', 'another', 'girl', 'looks', 'down', 'court', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:21:31,940.940 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tennis', 'court', 'woman', 'shoe', 'shirt', 'short', 'hand', 'leg', 'hair', 'line', 'head', 'player', 'wall', 'letter', 'arm', 'logo', 'ground', 'band', 'skirt', 'top', 'handle', 'girl', 'outfit', 'ball', 'tank', 'ponytail', 'sock', 'person', 'face', 'necklace', 'curtain', 'sign', 'dress', 'man', 'banner', 'string', 'mouth', 'hat', 'stand', 'female', 'cap', 'advertisement', 'ear', 'wrist', 'shadow', 'uniform', 'stripe', 'knee', 'spectator'] 2022-03-16 05:21:47,909.909 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'line', 'band', 'top', 'player', 'woman', 'court', 'short', 'hair', 'girl', 'mouth', 'wall', 'arm', 'lady', 'letter', 'shirt', 'leg', 'ear', 'tank', 'handle', 'tennis', 'bent', 'skirt', 'shoe', 'outfit', 'ponytail'] 2022-03-16 05:24:11,593.593 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:09:05 iter: 1400 speed: 314.2 images/sec total_norm: 127.5011 (130.7926) loss: 171.3775 (172.3323) masked_loss: 2.6194 (2.6451) tag_loss: 168.6091 (169.6872) time: 1.4354 (1.6297) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4301 (1.6245) lr: 0.000098 max mem: 26307 2022-03-16 05:24:11,954.954 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3333333432674408 2022-03-16 05:24:11,954.954 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 202.48736572265625 2022-03-16 05:24:11,954.954 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.4316192626953 2022-03-16 05:24:14,878.878 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013179749250411987 2022-03-16 05:24:14,878.878 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:24:14,879.879 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'kitchen', 'area', 'is', 'clean', 'and', 'ready', 'to', 'use', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:24:14,894.894 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['chair', 'wall', 'window', 'table', 'television', 'floor', 'room', 'shelf', 'drawer', '[UNK]', 'cushion', 'book', 'cabinet', 'microwave', 'ceiling', 'door', 'handle', 'picture', 'kitchen', 'desk', 'light', 'leg', 'rug', 'building', 'knob', 'stove', 'stool', 'top', 'couch', 'oven', 'box', 'bed', 'monitor', 'coffee', 'can', 'seat', 'lamp', 'living', 'carpet', 'bottle', 'bowl', 'pot', 'pillow', 'basket', 'screen', 'dresser', 'blind', 'lid', 'fireplace', 'paper'] 2022-03-16 05:24:30,829.829 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'area', 'room', 'light', 'television', 'design', 'floor', 'bed', 'table', 'wall', 'ready', 'glass', 'chair', 'paper', 'window', 'kitchen', 'leg', 'clean', 'handle', 'cabinet', 'bottle', 'ceiling', 'blind', 'pot', 'towel', 'shelf', 'trash', 'lid', 'garbage', 'drawer', 'tile', 'stove', 'knob', 'oven', 'microwave', 'rug', 'cushion'] 2022-03-16 05:26:54,578.578 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:07:32 iter: 1500 speed: 314.1 images/sec total_norm: 128.0446 (131.7554) loss: 171.2310 (171.7803) masked_loss: 2.4614 (2.5328) tag_loss: 168.9170 (169.2475) time: 1.4349 (1.6299) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4299 (1.6248) lr: 0.000098 max mem: 26307 2022-03-16 05:26:54,941.941 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 05:26:54,941.941 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.58511352539062 2022-03-16 05:26:54,941.941 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.49695444107056 2022-03-16 05:26:57,910.910 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013360531069338322 2022-03-16 05:26:57,910.910 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:26:57,911.911 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'double', '[MASK]', 'bus', '[MASK]', 'along', 'the', 'streets', 'in', 'madrid', ',', 'spain', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:26:57,926.926 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'bus', 'building', 'street', 'road', 'tire', 'line', 'wheel', 'sign', 'balcony', 'car', '[UNK]', 'advertisement', 'person', 'sidewalk', 'door', 'windshield', 'light', 'front', 'license', 'plate', 'letter', 'decker', 'logo', 'city', 'woman', 'man', 'pole', 'sky', 'driver', 'double', 'curb', 'deck', 'mirror', 'number', 'top', 'railing', 'ad', 'passenger', 'word', 'hair', 'shirt', 'red', 'tree', 'traffic', 'van', 'fence', 'wall', 'picture', 'back'] 2022-03-16 05:27:13,901.901 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'double', 'window', 'sign', 'bus', 'plate', 'wheel', 'license', 'balcony', 'tire', 'railing', 'vent', 'decker'] 2022-03-16 05:29:37,590.590 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:05:51 iter: 1600 speed: 314.1 images/sec total_norm: 129.1474 (132.6272) loss: 175.4478 (172.9883) masked_loss: 2.5299 (2.5677) tag_loss: 172.7031 (170.4205) time: 1.4346 (1.6301) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4294 (1.6249) lr: 0.000098 max mem: 26307 2022-03-16 05:29:37,952.952 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 05:29:37,952.952 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.15374755859375 2022-03-16 05:29:37,952.952 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.51970717486213 2022-03-16 05:29:40,973.973 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013498584739863873 2022-03-16 05:29:40,973.973 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:29:40,974.974 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'gi', '##raf', '##fe', '##s', 'standing', '[MASK]', 'the', 'side', 'of', 'a', 'grassy', 'hill', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:29:40,989.989 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'neck', 'head', 'grass', 'leg', 'field', 'tree', 'tail', 'bush', 'branch', 'ear', 'mane', 'ground', 'trunk', 'face', 'spot', 'wild', 'horn', 'hair', 'couple', 'zebra', 'group', 'grassy', 'shadow', 'next', 'bird', 'body', 'other', 'background', 'animal', 'sky', 'area', 'standing', 'tall', 'baby', 'brush', 'large', 'stripe', 'plain', 'small', 'dirt', 'lush', 'grazing', 'open', 'mouth', 'deer', 'top', 'rock', 'brown', 'stick'] 2022-03-16 05:29:56,977.977 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'side', 'field', 'ground', 'hill', 'neck', 'tree', 'branch', 'spot', 'leg', 'grass', 'tail', 'bush', 'grassy', 'mane'] 2022-03-16 05:32:21,062.062 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:04:20 iter: 1700 speed: 313.2 images/sec total_norm: 130.4102 (133.6316) loss: 171.8849 (171.3629) masked_loss: 2.4974 (2.4935) tag_loss: 169.2585 (168.8694) time: 1.4351 (1.6347) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4304 (1.6296) lr: 0.000097 max mem: 26307 2022-03-16 05:32:21,424.424 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 05:32:21,424.424 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.7703857421875 2022-03-16 05:32:21,424.424 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.63988071017795 2022-03-16 05:32:24,476.476 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013531708158552647 2022-03-16 05:32:24,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:32:24,477.477 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'shirt', 'less', 'man', 'with', 'tennis', 'rack', '##et', 'walking', 'on', 'a', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:32:24,492.492 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', '[UNK]', 'tennis', 'hand', 'short', 'hat', 'leg', 'wall', 'court', 'head', 'cap', 'sock', 'shoe', 'fence', 'shadow', 'logo', 'sign', 'arm', 'ground', 'banner', 'letter', 'line', 'band', 'ball', 'face', 'handle', 'wrist', 'hair', 'advertisement', 'player', 'ear', 'back', 'tattoo', 'stripe', 'background', 'top', 'shirt', 'person', 'net', 'foot', 'spectator', 'sunglasses', 'pole', 'mouth', 'chair', 'watch', 'bracelet', 'string', 'board', 'knee'] 2022-03-16 05:32:40,513.513 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'court', 'short', 'wall', 'arm', 'sign', 'leg', 'ear', 'tennis', 'net', 'hat', 'cap', 'pole', 'wrist', 'logo', 'fence', 'banner', 'bracelet', 'sock'] 2022-03-16 05:35:04,179.179 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:02:28 iter: 1800 speed: 313.9 images/sec total_norm: 129.6942 (132.7102) loss: 167.9482 (166.3510) masked_loss: 2.4083 (2.4409) tag_loss: 165.8114 (163.9101) time: 1.4350 (1.6312) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4297 (1.6260) lr: 0.000097 max mem: 26307 2022-03-16 05:35:04,539.539 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 05:35:04,539.539 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.75851440429688 2022-03-16 05:35:04,540.540 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.99534807707134 2022-03-16 05:35:07,619.619 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013688676990568638 2022-03-16 05:35:07,619.619 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:35:07,619.619 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'man', 'sits', 'on', '[MASK]', 'bench', '[MASK]', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:35:07,634.634 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'bench', 'snow', 'building', 'tree', 'ground', 'sidewalk', 'light', 'shadow', 'car', 'pole', 'man', 'road', 'house', 'street', 'roof', 'hat', 'jacket', 'window', '[UNK]', 'head', 'coat', 'leg', 'person', 'path', 'curb', 'hand', 'lamp', 'shoe', 'grass', 'hair', 'chimney', 'bush', 'arm', 'park', 'post', 'line', 'can', 'cap', 'trunk', 'back', 'sign', 'bag', 'branch', 'foot', 'woman', 'puddle', 'trash', 'town', 'truck'] 2022-03-16 05:35:23,597.597 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'building', 'road', 'street', 'young', 'light', 'car', 'ground', 'window', 'tree', 'sky', 'leg', 'roof', 'snow', 'shadow', 'grass', 'hat', 'jacket', 'bench', 'porch', 'sidewalk'] 2022-03-16 05:37:47,396.396 2829:trainer.py:487 do_train_dict(): eta: 1 day, 5:00:35 iter: 1900 speed: 313.7 images/sec total_norm: 127.9392 (128.4986) loss: 168.6987 (169.7907) masked_loss: 2.3185 (2.4161) tag_loss: 166.4492 (167.3746) time: 1.4363 (1.6322) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4312 (1.6271) lr: 0.000097 max mem: 26307 2022-03-16 05:37:47,759.759 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 05:37:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.89642333984375 2022-03-16 05:37:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.93155250549316 2022-03-16 05:37:50,876.876 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013802473433315754 2022-03-16 05:37:50,876.876 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:37:50,876.876 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'old', 'fashioned', 'train', 'engine', 'parked', 'with', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:37:50,892.892 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'ground', 'building', 'roof', 'sky', 'window', 'track', 'man', '[UNK]', 'light', 'smoke', 'door', 'pole', 'person', 'shirt', 'tree', 'bumper', 'station', 'chimney', 'car', 'wall', 'sign', 'engine', 'vent', 'front', 'barn', 'hat', 'shoe', 'steam', 'jean', 'wheel', 'wood', 'fence', 'jacket', 'gravel', 'head', 'pipe', 'table', 'stack', 'bench', 'number', 'hair', 'boy', 'bag', 'child', 'platform', 'box', 'coat', 'cloud', 'black'] 2022-03-16 05:38:06,903.903 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'family', 'man', 'group', 'old', 'building', 'front', 'light', 'woman', 'ground', 'hair', 'person', 'table', 'boy', 'engine', 'paper', 'window', 'train', 'tree', 'sky', 'shirt', 'roof', 'tank', 'flag', 'smoke', 'jacket', 'bench', 'shed', 'barn', 'stack', 'ladder', 'picnic', 'tire', 'fashioned', 'vent'] 2022-03-16 05:40:30,491.491 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:58:32 iter: 2000 speed: 313.9 images/sec total_norm: 128.3105 (131.0135) loss: 166.7830 (165.3335) masked_loss: 2.3378 (2.3857) tag_loss: 164.6639 (162.9477) time: 1.4346 (1.6310) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4294 (1.6260) lr: 0.000097 max mem: 26307 2022-03-16 05:40:30,853.853 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 05:40:30,853.853 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.54660034179688 2022-03-16 05:40:30,853.853 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.90539042154948 2022-03-16 05:40:34,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013828023336827755 2022-03-16 05:40:34,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:40:34,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'on', 'a', 'boat', '[MASK]', 'at', 'the', 'water', 'and', 'the', 'sun', '[MASK]', 'on', 'the', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:40:34,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'water', 'sky', 'collar', 'head', 'ear', 'neck', 'land', 'sun', 'boat', 'tree', 'button', 'lake', 'reflection', 'hill', '[UNK]', 'body', 'mountain', 'nose', 'light', 'eye', 'leg', 'rope', 'view', 'window', 'back', 'tail', 'harness', 'wave', 'pole', 'arm', 'paw', 'face', 'cloud', 'snout', 'buckle', 'hair', 'beach', 'belt', 'spot', 'mouth', 'white', 'seat', 'background', 'car', 'shadow', 'person', 'bar', 'bolt', 'leash'] 2022-03-16 05:40:49,996.996 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'water', 'light', 'hill', 'sun', 'neck', 'tree', 'sky', 'dog', 'boat', 'bell', 'nose', 'button', 'trunk', 'elbow', 'collar', 'harness', 'buckle'] 03-16 05:42:20.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 05:42:20.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 05:42:21.210 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}] 2022-03-16 05:43:13,797.797 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:56:32 iter: 2100 speed: 313.5 images/sec total_norm: 127.0578 (128.6500) loss: 167.9351 (168.6626) masked_loss: 2.3923 (2.4091) tag_loss: 165.2934 (166.2535) time: 1.4350 (1.6331) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4298 (1.6279) lr: 0.000097 max mem: 26307 2022-03-16 05:43:14,159.159 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3125 2022-03-16 05:43:14,159.159 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.48577880859375 2022-03-16 05:43:14,159.159 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
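The interleaved aml_server.py monitor() lines report one dict per GPU, e.g. {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}. One way to produce the same records is nvidia-smi's documented query mode; the real aml_server.py may instead parse the plain `nvidia-smi` table it logs, so treat this as a sketch:

```python
import subprocess

def gpu_stats():
    """Return per-GPU dicts in the shape of the monitor() log lines."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    stats = []
    for line in out.strip().splitlines():
        # each line looks like "29000, 32510, 100"
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats
```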
= 68.12507594715466 2022-03-16 05:43:17,299.299 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013921589590609074 2022-03-16 05:43:17,299.299 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:43:17,299.299 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'elephant', 'in', 'the', 'zoo', 'is', 'pushing', 'against', 'the', 'tree', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:43:17,314.314 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'ground', 'rock', 'leg', 'elephant', 'wall', 'zoo', 'ear', 'trunk', 'tree', 'enclosure', 'head', 'shadow', 'tail', 'bush', '[UNK]', 'foot', 'log', 'plant', 'dirt', 'weed', 'boulder', 'water', 'stump', 'eye', 'road', 'next', 'mouth', 'fence', 'stone', 'standing', 'large', 'hole', 'stick', 'area', 'block', 'pole', 'other', 'branch', 'animal', 'body', 'back', 'baby', 'hill', 'leaf', 'wire', 'sky', 'post', 'paw', 'bear'] 2022-03-16 05:43:33,356.356 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'ground', 'rock', 'wall', 'eye', 'tree', 'leg', 'ear', 'shadow', 'grass', 'tail', 'trunk', 'zoo', 'elephant', 'enclosure', 'stump'] 2022-03-16 05:45:57,163.163 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:54:30 iter: 2200 speed: 313.4 images/sec total_norm: 124.6061 (126.4061) loss: 165.6890 (167.8841) masked_loss: 2.3442 (2.3474) tag_loss: 163.9249 (165.5366) time: 1.4349 (1.6336) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4297 (1.6282) lr: 0.000097 max mem: 26307 2022-03-16 05:45:57,527.527 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 05:45:57,527.527 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.9755401611328 2022-03-16 05:45:57,527.527 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.95332668138587 2022-03-16 05:46:00,698.698 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013937680050730705 2022-03-16 05:46:00,698.698 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:46:00,699.699 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'gi', '##raf', '##fe', 'in', 'the', 'middle', 'of', 'a', 'small', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:46:00,714.714 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'ground', '[UNK]', 'leg', 'head', 'neck', 'dirt', 'forest', 'branch', 'fence', 'bush', 'ear', 'stick', 'tail', 'log', 'rock', 'trunk', 'horn', 'wood', 'leaf', 'next', 'pole', 'eye', 'stump', 'field', 'spot', 'area', 'post', 'plant', 'grass', 'hill', 'face', 'animal', 'mountain', 'tall', 'standing', 'bird', 'zoo', 'large', 'hair', 'body', 'mane', 'mouth', 'brown', 'grassy', 'enclosure', 'small', 'walking', 'group'] 2022-03-16 05:46:16,769.769 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'small', 'ground', 'middle', 'post', 'forest', 'neck', 'tree', 'wood', 'branch', 'sky', 'leg', 'bush', 'stick', 'dirt', 'elephant'] 2022-03-16 05:48:40,316.316 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:52:18 iter: 2300 speed: 313.8 images/sec total_norm: 125.1410 (128.4399) loss: 166.4142 (167.0106) masked_loss: 2.3073 (2.3865) tag_loss: 163.6856 (164.6240) time: 1.4332 (1.6315) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4279 (1.6262) lr: 0.000097 max mem: 26307 2022-03-16 05:48:40,677.677 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 05:48:40,677.677 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 189.9331512451172 2022-03-16 05:48:40,677.677 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.95998859405518 2022-03-16 05:48:43,896.896 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.013976288959383965 2022-03-16 05:48:43,896.896 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:48:43,897.897 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'skate', '##board', '##er', 'performing', 'a', 'stunt', '##chel', 'an', '[MASK]', 'area', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:48:43,912.912 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'man', 'building', 'step', 'shirt', 'stair', 'shoe', 'car', 'pole', 'sign', 'person', 'sidewalk', 'tree', 'street', 'boy', 'ground', 'light', 'van', 'hair', 'sky', 'hand', 'bench', 'arm', 'jean', 'wall', 'head', 'window', 'hat', 'railing', 'wheel', 'truck', 'rail', 'city', 'trick', 'road', 'leg', 'bag', 'jacket', 'park', 'lot', 'can', 'line', 'post', 'skate', 'tire', 'bicycle', 'ramp', 'board', 'pillar', 'roof'] 2022-03-16 05:48:59,960.960 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'city', 'man', 'area', 'building', 'road', 'street', 'light', 'car', 'ground', 'wall', 'boy', 'van', 'window', 'step', 'sign', 'shirt', 'bus', 'urban', 'traffic', 'bag', 'truck', 'hat', 'pole', 'bench', 'fence', 'reflection', 'shoe', 'sidewalk', 'tire', 'umbrella', 'pillar', 'stunt', 'railing', 'stair'] 2022-03-16 05:51:23,562.562 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:50:07 iter: 2400 speed: 313.6 images/sec total_norm: 133.2892 (136.3614) loss: 168.5715 (168.7346) masked_loss: 2.3544 (2.3869) tag_loss: 166.8105 (166.3478) time: 1.4354 (1.6325) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4296 (1.6272) lr: 0.000096 max mem: 26307 2022-03-16 05:51:23,929.929 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 05:51:23,929.929 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.15872192382812 2022-03-16 05:51:23,929.929 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.13902069091797 2022-03-16 05:51:27,179.179 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014039664529263973 2022-03-16 05:51:27,179.179 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:51:27,179.179 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'snow', '##board', '##er', 'with', 'glasses', 'is', 'flying', 'through', '[MASK]', 'air', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:51:27,195.195 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'jacket', 'sky', 'glove', 'head', 'man', 'cloud', 'boot', 'leg', 'arm', 'helmet', 'hand', 'zipper', 'face', 'ski', 'coat', 'person', 'foot', 'snow', 'skier', 'board', 'air', 'pole', 'shoe', 'tree', 'ground', 'top', 'slope', 'logo', 'suit', 'knee', 'hill', 'black', 'mountain', 'stripe', 'hat', 'snowy', 'glasses', 'blue', 'yellow', 'strap', 'hood', 'design', 'white', 'jump', 'pine', 'hair', 'clothes', 'half', 'building'] 2022-03-16 05:51:43,151.151 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'air', 'person', 'arm', 'foot', 'sky', 'leg', 'coat', 'cloud', 'jacket', 'boot', 'helmet', 'shoe', 'glove', 'zipper'] 2022-03-16 05:54:06,777.777 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:47:51 iter: 2500 speed: 313.7 images/sec total_norm: 124.3773 (127.4217) loss: 166.9203 (166.5881) masked_loss: 2.3137 (2.3667) tag_loss: 164.1752 (164.2214) time: 1.4337 (1.6322) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.6271) lr: 0.000096 max mem: 26307 2022-03-16 05:54:07,138.138 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 05:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.09341430664062 2022-03-16 05:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
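In the trainer.py lines, every meter is printed as two numbers, e.g. `loss: 167.9482 (166.3510)`. The pairing is consistent with the maskrcnn-benchmark-style convention of `windowed median (global average)`; that reading is an assumption, but a minimal meter under it looks like:

```python
from collections import deque

class SmoothedValue:
    """Prints as `median (global average)`, the apparent format of the
    trainer.py meters above. Window size and median convention are assumed."""
    def __init__(self, window=20):
        self.values = deque(maxlen=window)
        self.total = 0.0
        self.count = 0

    def update(self, v):
        self.values.append(v)
        self.total += v
        self.count += 1

    @property
    def median(self):
        s = sorted(self.values)
        return s[len(s) // 2]

    @property
    def global_avg(self):
        return self.total / self.count

    def __str__(self):
        return f"{self.median:.4f} ({self.global_avg:.4f})"

m = SmoothedValue()
for v in (167.9, 168.7, 166.4):
    m.update(v)
print(f"loss: {m}")   # -> loss: 167.9000 (167.6667)
```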
= 68.21351271409254 2022-03-16 05:54:10,420.420 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014027805998921394 2022-03-16 05:54:10,421.421 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:54:10,421.421 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'pretty', 'glass', 'holding', 'some', '[MASK]', 'pretty', 'flowers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:54:10,436.436 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'table', 'bouquet', 'leaf', 'vase', 'glass', 'rose', 'stem', 'base', 'background', '[UNK]', 'water', 'chair', 'wall', 'plate', 'cloth', 'light', 'napkin', 'person', 'bud', 'window', 'fork', 'spoon', 'shadow', 'bottom', 'handle', 'white', 'plant', 'knife', 'reflection', 'wine', 'design', 'shirt', 'berry', 'paper', 'couple', 'object', 'bowl', 'top', 'curtain', 'ribbon', 'green', 'floor', 'man', 'group', 'candle', 'next', 'woman', 'cup', 'wooden'] 2022-03-16 05:54:26,435.435 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'table', 'glass', 'chair', 'pretty', 'background', 'flower', 'stem', 'vase', 'bouquet'] 2022-03-16 05:56:50,141.141 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:45:38 iter: 2600 speed: 313.4 images/sec total_norm: 125.8118 (127.3205) loss: 170.5232 (171.8502) masked_loss: 2.2644 (2.3067) tag_loss: 167.7460 (169.5435) time: 1.4339 (1.6336) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.6285) lr: 0.000096 max mem: 26307 2022-03-16 05:56:50,503.503 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 05:56:50,503.503 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.7392578125 2022-03-16 05:56:50,503.503 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.09430016411676 2022-03-16 05:56:53,857.857 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01414740364998579 2022-03-16 05:56:53,857.857 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:56:53,857.857 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'girl', 'smiling', 'and', 'holding', 'a', '##vati', '[MASK]', 'remote', 'in', 'her', 'hands', 'over', 'her', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:56:53,872.872 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'wall', 'girl', 'eye', 'fireplace', 'arm', 'hand', 'mantle', 'head', 'controller', 'face', 'candle', 'nose', 'lamp', 'remote', 'picture', 'ear', '[UNK]', 'game', 'child', 'shelf', 'mouth', 'smile', 'sleeve', 'floor', 'book', 'strap', 'boy', 'curtain', 'teeth', 'room', 'couch', 'table', 'door', 'frame', 'wii', 'jean', 'chair', 'mirror', 'window', 'shade', 'bracelet', 'light', 'video', 'young', 'wrist', 'television', 'flower', 'toy'] 2022-03-16 05:57:09,942.942 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'game', 'face', 'short', 'hair', 'girl', 'video', 'wall', 'arm', 'smile', 'eye', 'shirt', 'picture', 'nose', 'mirror', 'smiling', 'flower', 'remote', 'wrist', 'lamp', 'shelf', 'candle', 'fireplace', 'strap'] 2022-03-16 05:59:33,618.618 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:43:25 iter: 2700 speed: 313.2 images/sec total_norm: 125.1971 (127.6473) loss: 163.8296 (164.3393) masked_loss: 2.2841 (2.2669) tag_loss: 161.7627 (162.0724) time: 1.4349 (1.6348) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4297 (1.6296) lr: 0.000096 max mem: 26307 2022-03-16 05:59:33,980.980 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 05:59:33,980.980 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.72967529296875 2022-03-16 05:59:33,980.980 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.14831297738212 2022-03-16 05:59:37,352.352 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014310806058347225 2022-03-16 05:59:37,352.352 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 05:59:37,353.353 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'zebra', 'are', 'eating', '[MASK]', 'grass', 'underneath', 'an', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 05:59:37,368.368 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'leg', 'ground', 'fence', 'mane', 'grass', 'head', 'shadow', 'ear', 'stripe', '[UNK]', 'tail', 'neck', 'eye', 'wall', 'enclosure', 'nose', 'dirt', 'mouth', 'zoo', 'pole', 'sand', 'tree', 'rock', 'post', 'body', 'building', 'field', 'pen', 'hay', 'next', 'log', 'trunk', 'other', 'shade', 'face', 'leaf', 'area', 'roof', 'couple', 'hair', 'standing', 'green', 'plant', 'bush', 'door', 'branch', 'spot', 'group', 'sky'] 2022-03-16 05:59:53,261.261 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'ground', 'wall', 'plant', 'leg', 'roof', 'ear', 'shadow', 'grass', 'tail', 'pole', 'dirt', 'fence', 'umbrella', 'enclosure', 'stripe', 'mane', 'zebra'] 2022-03-16 06:02:17,126.126 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:41:10 iter: 2800 speed: 313.1 images/sec total_norm: 129.6093 (133.3876) loss: 162.8348 (166.7582) masked_loss: 2.2670 (2.2786) tag_loss: 160.4984 (164.4796) time: 1.4355 (1.6351) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4302 (1.6299) lr: 0.000096 max mem: 26307 2022-03-16 06:02:17,490.490 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 06:02:17,490.490 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 188.01812744140625 2022-03-16 06:02:17,490.490 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.09601908716662 2022-03-16 06:02:20,905.905 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014494640752673149 2022-03-16 06:02:20,905.905 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:02:20,906.906 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'people', 'at', 'a', 'table', '[MASK]', 'a', 'laptop', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:02:20,921.921 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hair', 'glasses', 'man', 'face', 'hand', 'woman', 'wall', '[UNK]', 'head', 'person', 'table', 'mouth', 'paper', 'arm', 'jean', 'ear', 'laptop', 'boy', 'nose', 'chair', 'computer', 'girl', 'keyboard', 'plate', 'window', 'board', 'button', 'knife', 'desk', 'cup', 'food', 'watch', 'book', 'floor', 'sleeve', 'collar', 'phone', 'door', 'light', 'glass', 'bottle', 'picture', 'finger', 'bowl', 'box', 'bracelet', 'ring', 'handle', 'napkin'] 2022-03-16 06:02:36,912.912 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'book', 'hair', 'mouth', 'table', 'wall', 'arm', 'boy', 'eye', 'chair', 'paper', 'computer', 'watch', 'shirt', 'nose', 'ear', 'hat', 'wrist', 'glasses', 'mouse', 'sleeve', 'shelf', 'pad', 'laptop'] 2022-03-16 06:05:00,413.413 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:38:48 iter: 2900 speed: 313.6 images/sec total_norm: 128.8568 (130.7534) loss: 167.5192 (169.3143) masked_loss: 2.2662 (2.3142) tag_loss: 165.2468 (167.0001) time: 1.4342 (1.6329) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4290 (1.6277) lr: 0.000096 max mem: 26307 2022-03-16 06:05:00,775.775 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 06:05:00,776.776 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.3585205078125 2022-03-16 06:05:00,776.776 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
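The `eta` and `speed` fields in these trainer.py lines are mutually consistent: speed times the averaged iteration time comes out near 512 images in every row (e.g. 313.6 x 1.6329 ~ 512 at iter 2900), which suggests a global batch of 512 with eta = avg_iter_time x remaining iterations. A sketch under those assumptions; `global_batch` is inferred and `max_iter` below is a guess:

```python
import datetime

def progress_line(iter_num, max_iter, avg_iter_time, global_batch=512):
    """Reconstruct the eta/speed fields of the trainer.py log lines.
    global_batch=512 is inferred from speed * avg time; max_iter is unknown."""
    eta = datetime.timedelta(seconds=int(avg_iter_time * (max_iter - iter_num)))
    speed = global_batch / avg_iter_time
    # str(timedelta) renders as "1 day, 4:38:48", matching the log format
    return f"eta: {eta}  iter: {iter_num}  speed: {speed:.1f} images/sec"

print(progress_line(2900, 66000, 1.6329))  # approximate, since max_iter is a guess
```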
= 68.22449264526367 2022-03-16 06:05:04,227.227 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01450321264564991 2022-03-16 06:05:04,227.227 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:05:04,228.228 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'riding', 'a', 'skate', '##board', 'on', 'a', 'stone', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:05:04,243.243 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'stair', 'person', 'arm', 'man', 'leg', 'shirt', 'ground', 'tree', 'railing', 'sidewalk', 'building', 'hair', 'shoe', 'head', 'shadow', 'hand', 'pole', 'wall', 'wheel', 'boy', 'staircase', 'woman', 'step', 'window', 'board', 'foot', 'hat', 'sign', 'street', 'jacket', 'fence', 'photo', 'trunk', 'light', 'ramp', 'walkway', 'bridge', 'post', 'bench', 'bag', 'girl', 'trick', 'black', 'background', 'pillar', 'roof', 'park', 'door', 'platform'] 2022-03-16 06:05:20,213.213 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'building', 'woman', 'ground', 'board', 'person', 'arm', 'boy', 'bridge', 'stone', 'window', 'shirt', 'leg', 'bag', 'wheel', 'column', 'hat', 'statue', 'jacket', 'bench', 'fence', 'fountain', 'sidewalk', 'ramp', 'pillar', 'stair'] 2022-03-16 06:07:43,900.900 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:36:30 iter: 3000 speed: 313.2 images/sec total_norm: 127.2282 (132.1906) loss: 161.5379 (165.5154) masked_loss: 2.1634 (2.2410) tag_loss: 158.9387 (163.2745) time: 1.4341 (1.6349) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4291 (1.6297) lr: 0.000095 max mem: 26307 2022-03-16 06:07:44,261.261 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4444444477558136 2022-03-16 06:07:44,261.261 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.79383850097656 2022-03-16 06:07:44,261.261 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.0646368457425 2022-03-16 06:07:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014777246862649918 2022-03-16 06:07:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:07:47,760.760 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'poking', '[MASK]', "'", 's', 'now', 'on', 'a', 'stuffed', 'toy', 'bird', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:07:47,776.776 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'head', 'bird', 'window', 'tree', 'ear', 'cat', 'nose', 'beak', '[UNK]', 'tail', 'face', 'feather', 'toy', 'animal', 'leg', 'parrot', 'sky', 'wall', 'wing', 'paw', 'body', 'ledge', 'table', 'plant', 'mouth', 'leaf', 'cage', 'foot', 'curtain', 'duck', 'floor', 'glass', 'arm', 'button', 'green', 'frame', 'fur', 'chest', 'fence', 'trunk', 'rabbit', 'bush', 'collar', 'small', 'light', 'top', 'teddy', 'screen', 'bowl'] 2022-03-16 06:08:03,738.738 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'eye', 'neck', 'window', 'wing', 'tree', 'sky', 'dog', 'ear', 'bird', 'cat', 'grass', 'tail', 'toy', 'collar', 'stuffed', 'paw', 'beak'] 2022-03-16 06:10:27,447.447 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:34:11 iter: 3100 speed: 313.1 images/sec total_norm: 126.7292 (129.8545) loss: 163.0264 (165.4687) masked_loss: 2.2443 (2.2429) tag_loss: 160.7388 (163.2258) time: 1.4351 (1.6355) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4297 (1.6302) lr: 0.000095 max mem: 26307 2022-03-16 06:10:27,810.810 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 06:10:27,810.810 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.93826293945312 2022-03-16 06:10:27,811.811 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.15096294879913 2022-03-16 06:10:31,319.319 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014928682707250118 2022-03-16 06:10:31,320.320 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:10:31,320.320 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'woman', 'with', 'large', ',', '[MASK]', 'kite', '[MASK]', 'close', 'to', 'the', 'ground', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:10:31,335.335 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kite', 'sky', 'tree', 'string', 'tail', 'person', 'shirt', 'man', 'ground', 'woman', 'grass', '[UNK]', 'park', 'building', 'hair', 'shadow', 'head', 'jacket', 'car', 'hat', 'jean', 'short', 'boy', 'tent', 'field', 'child', 'ribbon', 'blue', 'hand', 'bush', 'girl', 'fence', 'leg', 'flag', 'cloud', 'beach', 'large', 'face', 'arm', 'bunch', 'rainbow', 'umbrella', 'group', 'couple', 'colorful', 'eye', 'bag', 'shoe', 'street', 'design'] 2022-03-16 06:10:47,393.393 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'house', 'large', 'park', 'woman', 'short', 'ground', 'hair', 'girl', 'person', 'tree', 'sky', 'shirt', 'leg', 'camera', 'string', 'shadow', 'grass', 'tail', 'bush', 'hat', 'jacket', 'shoe', 'colorful', 'kite', 'sock'] 03-16 06:12:21.311 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 06:12:21.312 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 06:12:22.579 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 06:13:10,997.997 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:31:50 iter: 3200 speed: 313.1 images/sec total_norm: 127.8397 (133.1336) loss: 166.6285 (165.9115) masked_loss: 2.2942 (2.3085) tag_loss: 164.0938 (163.6030) time: 1.4334 (1.6355) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4281 (1.6302) lr: 0.000095 max mem: 26307 2022-03-16 06:13:11,359.359 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 06:13:11,359.359 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.1209716796875 2022-03-16 06:13:11,359.359 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.17579731796727 2022-03-16 06:13:14,934.934 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014963779598474503 2022-03-16 06:13:14,934.934 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:13:14,935.935 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'girl', 'sitting', 'down', ',', 'outside', '[MASK]', 'eating', 'a', 'sandwich', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:13:14,950.950 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'eye', 'girl', 'nose', 'hand', 'shirt', 'face', 'tree', 'man', 'head', 'mouth', 'bread', 'cake', 'food', '[UNK]', 'table', 'woman', 'finger', 'glasses', 'sandwich', 'window', 'chair', 'background', 'pillar', 'person', 'flower', 'ear', 'column', 'dress', 'plate', 'young', 'arm', 'sky', 'napkin', 'bang', 'building', 'ring', 'eyebrow', 'necklace', 'trunk', 'bush', 'child', 'little', 'top', 'jacket', 'pizza', 'grass', 'watch', 'pole', 'wall'] 2022-03-16 06:13:30,960.960 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'hair', 'girl', 'outside', 'mouth', 'food', 'eye', 'chair', 'tree', 'jean', 'shirt', 'finger', 'nose', 'column', 'bread', 'glasses', 'eyebrow', 'cake', 'sandwich', 'necklace', 'pillar', 'strap'] 2022-03-16 06:15:54,774.774 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:29:33 iter: 3300 speed: 312.6 images/sec total_norm: 126.2624 (128.1804) loss: 162.7510 (165.4645) masked_loss: 2.3192 (2.2897) tag_loss: 160.6864 (163.1749) time: 1.4357 (1.6378) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4307 (1.6326) lr: 0.000095 max mem: 26307 2022-03-16 06:15:55,135.135 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174 2022-03-16 06:15:55,135.135 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.38565063476562 2022-03-16 06:15:55,135.135 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.2174820619471 2022-03-16 06:15:58,650.650 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.014947595074772835 2022-03-16 06:15:58,650.650 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:15:58,651.651 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'are', 'several', 'warship', 'players', 'practicing', 'for', '[MASK]', 'game', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:15:58,666.666 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['baseball', 'man', '[UNK]', 'shirt', 'bat', 'net', 'ball', 'hat', 'player', 'field', 'pole', 'stadium', 'cap', 'glove', 'shoe', 'person', 'head', 'sign', 'jersey', 'ground', 'hand', 'uniform', 'grass', 'leg', 'stand', 'logo', 'batter', 'dirt', 'seat', 'fence', 'wall', 'line', 'goal', 'shadow', 'number', 'mound', 'game', 'arm', 'light', 'umpire', 'helmet', 'tennis', 'netting', 'building', 'base', 'sky', 'pitcher', 'jacket', 'chair', 'catcher'] 2022-03-16 06:16:14,681.681 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'several', 'game', 'player', 'ground', 'person', 'seat', 'arm', 'stadium', 'baseball', 'ball', 'sign', 'shirt', 'jersey', 'wheel', 'grass', 'net', 'hat', 'cap', 'pole', 'bat', 'shoe', 'tire', 'mat', 'glove', 'umpire'] 2022-03-16 06:18:38,300.300 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:27:09 iter: 3400 speed: 313.1 images/sec total_norm: 124.4796 (126.8232) loss: 166.1431 (169.2912) masked_loss: 2.2369 (2.2224) tag_loss: 163.6011 (167.0687) time: 1.4338 (1.6353) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4289 (1.6299) lr: 0.000095 max mem: 26307 2022-03-16 06:18:38,661.661 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.37142857909202576 2022-03-16 06:18:38,661.661 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.81141662597656 2022-03-16 06:18:38,662.662 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
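The `caption acc` values are exact float32 ratios with small, varying denominators (0.4, 5/11 ~ 0.4545, 5/16 = 0.3125, 15/31 ~ 0.4839, ...), which fits a per-batch accuracy over only the supervised masked positions. A presumed computation; the ignore-label convention is an assumption, not read from the pipeline:

```python
import torch

def caption_accuracy(logits, target, ignore_index=-100):
    """Fraction of supervised (masked) caption positions where the argmax
    prediction matches the label; positions labelled ignore_index are skipped."""
    pred = logits.argmax(dim=-1)
    valid = target != ignore_index
    return (pred[valid] == target[valid]).float().mean()

logits = torch.tensor([[2.0, 0.1], [0.2, 1.0], [0.5, 0.4]])
target = torch.tensor([0, 1, -100])
print(caption_accuracy(logits, target))  # tensor(1.) -> both valid positions correct
```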
= 68.13086558750697 2022-03-16 06:18:42,268.268 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015078512020409107 2022-03-16 06:18:42,268.268 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:18:42,269.269 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'green', '[MASK]', 'signs', 'sitting', 'on', 'top', 'of', 'a', 'pole', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:18:42,284.284 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'sign', 'cloud', 'pole', 'tree', 'letter', 'street', 'building', 'car', '[UNK]', 'road', 'line', 'window', 'sidewalk', 'light', 'arrow', 'bush', 'wire', 'stop', 'roof', 'number', 'word', 'traffic', 'power', 'house', 'tire', 'wall', 'green', 'truck', 'curb', 'grass', 'person', 'post', 'intersection', 'fence', 'van', 'can', 'blue', 'red', 'wheel', 'shadow', 'bridge', 'fire', 'way', 'bolt', 'side', 'background', 'corner', 'writing', 'parking'] 2022-03-16 06:18:58,399.399 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'number', 'line', 'top', 'road', 'power', 'street', 'car', 'green', 'bridge', 'couple', 'tree', 'letter', 'sign', 'sky', 'circle', 'truck', 'grass', 'cloud', 'pole', 'sidewalk', 'curb'] 2022-03-16 06:21:21,819.819 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:24:44 iter: 3500 speed: 313.1 images/sec total_norm: 124.2238 (125.7086) loss: 168.4714 (170.0881) masked_loss: 2.1295 (2.1714) tag_loss: 166.4771 (167.9167) time: 1.4342 (1.6352) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4292 (1.6301) lr: 0.000095 max mem: 26307 2022-03-16 06:21:22,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4838709533214569 2022-03-16 06:21:22,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.1074981689453 2022-03-16 06:21:22,181.181 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.00398275587294 2022-03-16 06:21:25,801.801 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015096554532647133 2022-03-16 06:21:25,802.802 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:21:25,802.802 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'riding', 'in', 'a', 'boat', '[MASK]', 'a', 'dog', 'on', 'his', 'lap', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:21:25,818.818 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'dog', 'hair', 'sunglasses', 'man', 'head', 'jacket', 'collar', 'water', 'short', 'sky', 'boat', 'leg', 'ear', 'face', 'nose', 'seat', 'can', 'window', 'pole', 'cup', 'shirt', '[UNK]', 'car', 'handle', 'arm', 'shadow', 'chair', 'mouth', 'tree', 'neck', 'foot', 'sleeve', 'glasses', 'bar', 'drink', 'finger', 'bench', 'door', 'vehicle', 'paw', 'beach', 'ocean', 'ring', 'wave', 'land', 'lid', 'beer', 'table', 'person'] 2022-03-16 06:21:41,866.866 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'man', 'hand', 'face', 'water', 'cup', 'short', 'hair', 'seat', 'chair', 'bar', 'sky', 'shirt', 'dog', 'boat', 'coffee', 'leg', 'ear', 'lap', 'pole', 'jacket', 'collar', 'sunglasses'] 2022-03-16 06:24:05,675.675 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:22:24 iter: 3600 speed: 312.5 images/sec total_norm: 127.5525 (129.4285) loss: 162.5210 (162.6636) masked_loss: 2.2596 (2.2590) tag_loss: 160.2764 (160.4045) time: 1.4350 (1.6386) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4297 (1.6335) lr: 0.000095 max mem: 26307 2022-03-16 06:24:06,036.036 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 06:24:06,037.037 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.65032958984375 2022-03-16 06:24:06,037.037 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 67.91705899625211 2022-03-16 06:24:09,676.676 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015121040865778923 2022-03-16 06:24:09,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:24:09,677.677 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'two', '[MASK]', 'are', 'proud', 'of', 'their', 'unusual', 'cake', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:24:09,692.692 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hair', 'hand', 'man', 'wall', 'woman', 'head', '[UNK]', 'cake', 'arm', 'picture', 'table', 'face', 'pizza', 'mouth', 'ear', 'crab', 'person', 'jacket', 'glass', 'girl', 'plate', 'nose', 'eye', 'leg', 'couple', 'knife', 'jean', 'food', 'glasses', 'ceiling', 'floor', 'sign', 'window', 'short', 'door', 'bottle', 'board', 'light', 'finger', 'chair', 'lady', 'poster', 'flower', 'bag', 'sweater', 'tray', 'box', 'apron', 'cup'] 2022-03-16 06:24:25,750.750 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'woman', 'hair', 'person', 'table', 'wall', 'food', 'couple', 'shirt', 'picture', 'leg', 'dress', 'nose', 'ear', 'knife', 'unusual', 'proud', 'cake', 'badge', 'tray', 'necklace', 'candle', 'crab'] 2022-03-16 06:26:49,556.556 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:20:03 iter: 3700 speed: 312.4 images/sec total_norm: 124.1741 (129.3139) loss: 162.8971 (164.3228) masked_loss: 2.0609 (2.0956) tag_loss: 160.4019 (162.2273) time: 1.4354 (1.6388) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4303 (1.6336) lr: 0.000094 max mem: 26307 2022-03-16 06:26:49,916.916 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4375 2022-03-16 06:26:49,917.917 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.99635314941406 2022-03-16 06:26:49,917.917 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.03635366339432 2022-03-16 06:26:53,600.600 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015165851451456547 2022-03-16 06:26:53,601.601 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:26:53,601.601 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'surfing', 'on', 'a', 'board', 'in', 'the', 'water', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:26:53,616.616 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'mountain', 'tree', 'hill', 'wave', '[UNK]', 'beach', 'boat', 'head', 'person', 'sand', 'leg', 'hair', 'man', 'cloud', 'shirt', 'arm', 'hand', 'board', 'shore', 'rock', 'reflection', 'ocean', 'ear', 'short', 'ground', 'background', 'foot', 'woman', 'face', 'house', 'building', 'jacket', 'boy', 'dog', 'grass', 'hat', 'pole', 'tail', 'lake', 'girl', 'umbrella', 'body', 'child', 'rope', 'forest', 'bird', 'large', 'shoe'] 2022-03-16 06:27:09,623.623 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'water', 'short', 'board', 'hair', 'arm', 'hill', 'mountain', 'tree', 'sky', 'shirt', 'leg', 'wave'] 2022-03-16 06:29:33,461.461 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:17:41 iter: 3800 speed: 312.4 images/sec total_norm: 127.1361 (129.9609) loss: 166.2564 (168.0676) masked_loss: 2.0833 (2.1578) tag_loss: 163.2929 (165.9098) time: 1.4368 (1.6390) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4314 (1.6339) lr: 0.000094 max mem: 26307 2022-03-16 06:29:33,823.823 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 06:29:33,824.824 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.9650115966797 2022-03-16 06:29:33,824.824 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.0916255070613 2022-03-16 06:29:37,532.532 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015115528367459774 2022-03-16 06:29:37,532.532 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:29:37,532.532 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'with', 'a', 'tennis', 'ball', 'in', 'one', 'hand', 'and', '[MASK]', 'tennis', 'rack', '##et', 'in', 'the', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:29:37,548.548 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'shirt', '[UNK]', 'man', 'tennis', 'short', 'arm', 'court', 'head', 'leg', 'wall', 'hair', 'ball', 'face', 'band', 'ear', 'nose', 'logo', 'handle', 'wrist', 'mouth', 'shoe', 'sock', 'ground', 'player', 'line', 'eye', 'sleeve', 'letter', 'cap', 'hat', 'watch', 'stripe', 'string', 'bracelet', 'fence', 'collar', 'beard', 'finger', 'male', 'shadow', 'chair', 'person', 'sign', 'glasses', 'knee', 'necklace', 'stand', 'background', 'woman'] 2022-03-16 06:29:53,533.533 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'face', 'band', 'court', 'short', 'hair', 'mouth', 'wall', 'arm', 'eye', 'watch', 'ball', 'letter', 'shirt', 'background', 'nose', 'handle', 'tennis', 'net', 'wrist', 'logo', 'beard', 'sleeve', 'curtain', 'stripe'] 2022-03-16 06:32:17,360.360 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:15:18 iter: 3900 speed: 312.4 images/sec total_norm: 127.2599 (131.2631) loss: 167.6900 (166.4754) masked_loss: 2.0399 (2.1247) tag_loss: 165.4315 (164.3507) time: 1.4353 (1.6390) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4301 (1.6339) lr: 0.000094 max mem: 26307 2022-03-16 06:32:17,722.722 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544 2022-03-16 06:32:17,722.722 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.50115966796875 2022-03-16 06:32:17,723.723 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
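`Tag Precision.` (its value lands at the start of the following log line) sits on a 0-100 scale, while `Tag mAP` stays near 0.015, which fits average precision taken over the full tag vocabulary rather than over the ~50 tags shown in `Sample Generation`. One plausible per-image AP, with mAP as its mean over images; the exact averaging used by the pipeline is not visible in this log:

```python
import numpy as np

def average_precision(scores, gt):
    """AP for one image: `scores` ranks every vocabulary tag, `gt` is the set
    of ground-truth tag indices (cf. the GT Tags records above)."""
    order = np.argsort(-np.asarray(scores))
    hits, ap = 0, 0.0
    for rank, idx in enumerate(order, start=1):
        if idx in gt:
            hits += 1
            ap += hits / rank        # precision at each recall point
    return ap / max(len(gt), 1)

print(average_precision([0.9, 0.1, 0.8, 0.3], gt={0, 2}))  # 1.0: both GT tags ranked first
```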
= 68.04300327301026 2022-03-16 06:32:21,453.453 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015265017747879028 2022-03-16 06:32:21,453.453 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:32:21,453.453 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'and', 'white', 'flaps', 'a', 'small', 'bathroom', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:32:21,469.469 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bathroom', 'mirror', 'toilet', 'sink', 'floor', '[UNK]', 'outlet', 'paper', 'pipe', 'lid', 'seat', 'tank', 'bottle', 'door', 'light', 'trash', 'towel', 'bowl', 'shadow', 'sign', 'handle', 'bag', 'can', 'switch', 'reflection', 'tissue', 'holder', 'man', 'tile', 'soap', 'curtain', 'box', 'ceiling', 'shelf', 'person', 'basket', 'cup', 'shower', 'white', 'picture', 'roll', 'drain', 'window', 'head', 'hand', 'base', 'stall', 'small', 'hair'] 2022-03-16 06:32:37,505.505 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'black', 'white', 'light', 'floor', 'wall', 'seat', 'bar', 'box', 'tank', 'mirror', 'bathroom', 'switch', 'sink', 'tissue', 'pipe', 'reflection', 'dish', 'towel', 'curtain', 'shelf', 'toilet', 'lid', 'outlet'] 2022-03-16 06:35:01,423.423 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:12:56 iter: 4000 speed: 312.1 images/sec total_norm: 127.7047 (130.4037) loss: 161.2848 (163.2589) masked_loss: 2.0475 (2.1304) tag_loss: 158.8457 (161.1286) time: 1.4358 (1.6406) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4306 (1.6354) lr: 0.000094 max mem: 26307 2022-03-16 06:35:01,785.785 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 06:35:01,786.786 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.9211883544922 2022-03-16 06:35:01,786.786 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.03520258461556 2022-03-16 06:35:05,562.562 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015245223417878151 2022-03-16 06:35:05,562.562 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:35:05,563.563 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'zebra', '[MASK]', 'nu', '##zzle', 'each', 'other', 'while', 'another', 'zebra', 'stands', 'in', 'the', 'background', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:35:05,578.578 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'zebra', 'ear', 'ground', 'shadow', 'head', 'mane', 'grass', '[UNK]', 'nose', 'eye', 'tree', 'stripe', 'tail', 'field', 'dirt', 'face', 'mouth', 'branch', 'rock', 'other', 'hair', 'trunk', 'body', 'bush', 'neck', 'background', 'foot', 'baby', 'next', 'area', 'leaf', 'standing', 'road', 'back', 'grassy', 'couple', 'adult', 'group', 'dry', 'small', 'snout', 'stick', 'herd', 'spot', 'mother', 'white', 'sand', 'surface', 'black'] 2022-03-16 06:35:21,595.595 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'face', 'field', 'ground', 'rock', 'eye', 'tree', 'leg', 'background', 'nose', 'ear', 'shadow', 'grass', 'tail', 'stripe', 'mane', 'zebra'] 2022-03-16 06:37:45,517.517 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:10:34 iter: 4100 speed: 312.0 images/sec total_norm: 128.1899 (130.5380) loss: 169.8493 (168.4372) masked_loss: 2.1101 (2.1154) tag_loss: 166.4966 (166.3218) time: 1.4340 (1.6409) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4287 (1.6357) lr: 0.000094 max mem: 26307 2022-03-16 06:37:45,875.875 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3529411852359772 2022-03-16 06:37:45,875.875 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.58633422851562 2022-03-16 06:37:45,876.876 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.04508300054641 2022-03-16 06:37:49,695.695 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015245894901454449 2022-03-16 06:37:49,695.695 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:37:49,695.695 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'herd', 'of', 'sheep', 'walking', 'along', 'a', 'lush', '[MASK]', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:37:49,711.711 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'grass', 'sheep', 'field', 'fence', 'herd', 'green', 'animal', 'grassy', 'pasture', 'leg', '[UNK]', 'head', 'pole', 'bunch', 'lush', 'trunk', 'wood', 'post', 'hill', 'building', 'flock', 'leaf', 'large', 'background', 'grazing', 'house', 'bush', 'cow', 'group', 'cloud', 'road', 'mountain', 'open', 'forest', 'wool', 'person', 'wire', 'roof', 'white', 'lamb', 'rock', 'area', 'big', 'distance', 'middle', 'day', 'tail', 'goat'] 2022-03-16 06:38:05,683.683 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['field', 'green', 'hill', 'tree', 'sky', 'walking', 'grass', 'bush', 'cloud', 'pole', 'trunk', 'sheep', 'herd', 'lush'] 2022-03-16 06:40:29,468.468 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:08:08 iter: 4200 speed: 312.3 images/sec total_norm: 125.5093 (128.2729) loss: 163.6538 (164.5845) masked_loss: 2.0316 (2.1320) tag_loss: 162.0480 (162.4525) time: 1.4346 (1.6395) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4295 (1.6345) lr: 0.000094 max mem: 26307 2022-03-16 06:40:29,829.829 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 06:40:29,829.829 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.20590209960938 2022-03-16 06:40:29,829.829 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
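The `Input ids sample` lines show the caption side of each batch: a `[CLS]` ... `[SEP]` WordPiece sequence with some tokens replaced by `[MASK]`, padded to a fixed length of 70. A sketch of that construction, assuming BERT-style random masking at roughly 15% (the rate, and whether an 80/10/10 mask/replace/keep split is applied, are assumptions not visible in the log):

```python
import random
from transformers import BertTokenizer

tok = BertTokenizer.from_pretrained("bert-base-uncased")

def build_masked_caption(caption, max_len=70, mask_rate=0.15, seed=0):
    """Tokenize a caption, randomly mask tokens, and pad to max_len,
    reproducing the shape of the `Input ids sample` lines."""
    rng = random.Random(seed)
    tokens = ["[CLS]"] + tok.tokenize(caption)[: max_len - 2] + ["[SEP]"]
    tokens = ["[MASK]" if t not in ("[CLS]", "[SEP]") and rng.random() < mask_rate
              else t for t in tokens]
    return tokens + ["[PAD]"] * (max_len - len(tokens))

print(build_masked_caption("a herd of sheep walking along a lush green field ."))
```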
= 68.05043224955715 2022-03-16 06:40:33,683.683 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015223406255245209 2022-03-16 06:40:33,683.683 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:40:33,683.683 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'truck', 'parked', 'on', 'the', 'curb', 'with', '[MASK]', 'sign', 'beside', '也', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:40:33,699.699 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tree', 'sign', 'windshield', 'tire', 'window', 'sky', 'building', 'plate', 'roof', 'pole', 'person', 'ground', 'license', 'truck', 'car', 'grill', 'light', 'road', 'man', 'mirror', 'street', 'bumper', 'hood', 'jacket', 'shirt', 'front', 'writing', 'woman', 'wheel', 'sidewalk', 'house', 'bus', 'hat', 'wall', 'door', 'bag', 'stop', 'bush', 'jean', 'van', 'shoe', 'coat', 'chimney', 'next', 'fence', 'logo', 'parking', 'lot', 'child'] 2022-03-16 06:40:49,639.639 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'house', 'black', 'building', 'road', 'light', 'ground', 'board', 'person', 'wall', 'writing', 'window', 'tree', 'sign', 'sky', 'shirt', 'roof', 'bag', 'truck', 'plate', 'wheel', 'mirror', 'brick', 'license', 'pole', 'hood', 'tire', 'curb', 'grill', 'windshield'] 03-16 06:42:22.581 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 06:42:22.581 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 06:42:23.638 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 06:43:13,354.354 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:05:41 iter: 4300 speed: 312.4 images/sec total_norm: 124.6680 (127.3306) loss: 164.0845 (166.2099) masked_loss: 2.1034 (2.1543) tag_loss: 162.1899 (164.0556) time: 1.4347 (1.6389) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4297 (1.6340) lr: 0.000094 max mem: 26307 2022-03-16 06:43:13,717.717 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-16 06:43:13,717.717 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.03314208984375 2022-03-16 06:43:13,717.717 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
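The interleaved `aml_server.py` `monitor()` entries report one dict per GPU with `mem_used`, `mem_total`, and `gpu_util`. The shape of that output can be reproduced by querying nvidia-smi in CSV mode; how aml_server.py itself parses the tool's output is not shown, so this is only a sketch:

```python
import subprocess

def gpu_stats():
    """One dict per GPU, matching the monitor() entries in this log."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    stats = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats

print(gpu_stats())  # e.g. [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, ...]
```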
= 68.14587593078613 2022-03-16 06:43:17,609.609 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01524761039763689 2022-03-16 06:43:17,609.609 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:43:17,610.610 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'teddy', '[MASK]', 'sitting', 'at', 'a', 'table', '[MASK]', 'drinking', '[MASK]', 'on', 'it', 'with', 'more', 'teddy', 'bears', 'in', 'the', 'background', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:43:17,625.625 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'chair', 'table', 'floor', 'teddy', 'animal', 'glass', 'bow', 'stuffed', 'ribbon', 'head', 'leg', 'tile', 'person', 'ball', '[UNK]', 'foot', 'pole', 'bottle', 'dog', 'shoe', 'hat', 'shirt', 'ear', 'monkey', 'tag', 'toy', 'basket', 'stool', 'nose', 'doll', 'room', 'window', 'paw', 'arm', 'store', 'scarf', 'paper', 'bag', 'cup', 'man', 'woman', 'cushion', 'sign', 'group', 'ground', 'display', 'bar', 'sweater', 'wall'] 2022-03-16 06:43:33,619.619 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'floor', 'table', 'glass', 'chair', 'animal', 'leg', 'background', 'bear', 'tail', 'bottle', 'drinking', 'hat', 'statue', 'bow', 'lighter', 'ribbon', 'teddy', 'stuffed', 'bucket'] 2022-03-16 06:45:57,540.540 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:03:18 iter: 4400 speed: 311.8 images/sec total_norm: 123.7966 (126.1624) loss: 166.1933 (169.4271) masked_loss: 2.1666 (2.2050) tag_loss: 164.4248 (167.2221) time: 1.4364 (1.6419) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4314 (1.6369) lr: 0.000093 max mem: 26307 2022-03-16 06:45:57,902.902 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 06:45:57,902.902 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.14837646484375 2022-03-16 06:45:57,902.902 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.210033331977 2022-03-16 06:46:01,826.826 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015206034295260906 2022-03-16 06:46:01,826.826 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:46:01,827.827 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'floating', 'along', 'a', 'shore', 'line', 'with', '[MASK]', 'of', 'cranes', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:46:01,842.842 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flag', 'sky', 'water', 'boat', 'crane', 'dock', 'window', 'cloud', '[UNK]', 'pole', 'harbor', 'person', 'building', 'bridge', 'man', 'tire', 'cabin', 'pier', 'sign', 'light', 'door', 'ship', 'box', 'rope', 'truck', 'large', 'street', 'tower', 'river', 'wall', 'car', 'stripe', 'mast', 'number', 'life', 'cone', 'wheel', 'roof', 'american', 'shirt', 'small', 'tree', 'container', 'structure', 'top', 'post', 'red', 'stair', 'name', 'white'] 2022-03-16 06:46:17,884.884 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'life', 'line', 'water', 'building', 'river', 'door', 'light', 'fire', 'writing', 'window', 'letter', 'sky', 'boat', 'flag', 'shore', 'cloud', 'pole', 'rope', 'dock', 'crane', 'stripe'] 2022-03-16 06:48:41,557.557 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:00:51 iter: 4500 speed: 312.2 images/sec total_norm: 124.5511 (126.5793) loss: 162.7146 (162.9010) masked_loss: 2.1206 (2.0979) tag_loss: 160.3948 (160.8031) time: 1.4348 (1.6401) data: 0.0001 (0.0005) to_device: 0.0049 (0.0048) time_gpu: 1.4298 (1.6348) lr: 0.000093 max mem: 26307 2022-03-16 06:48:41,917.917 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 06:48:41,918.918 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.21803283691406 2022-03-16 06:48:41,918.918 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
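`caption acc` values such as 0.5757... (≈ 19/33) and 0.6388... (≈ 23/36) imply small, varying denominators, consistent with accuracy being computed only over the masked caption positions of the logged batch rather than over all tokens. A sketch under that assumption:

```python
import torch

def masked_token_accuracy(logits, target, mask):
    """Fraction of masked positions predicted correctly; only positions
    where mask is True contribute (assumption, not stated in the log)."""
    pred = logits.argmax(dim=-1)
    return (pred[mask] == target[mask]).float().mean().item()

logits = torch.randn(2, 70, 30522)       # batch, seq_len 70, BERT vocab size
target = torch.randint(0, 30522, (2, 70))
mask = torch.zeros(2, 70, dtype=torch.bool)
mask[:, 3:10] = True                      # pretend these were the [MASK] slots
print("caption acc =", masked_token_accuracy(logits, target, mask))
```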
= 68.16388685806938 2022-03-16 06:48:45,878.878 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01532444916665554 2022-03-16 06:48:45,878.878 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:48:45,878.878 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'tow', 'lady', "'", 's', 'enjoying', 'a', 'chocolate', '[MASK]', 'and', 'some', 'coffee', '##ע', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:48:45,894.894 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'necklace', 'woman', 'plate', '[UNK]', 'sweater', 'shirt', 'hair', 'wall', 'hand', 'fork', 'knife', 'handle', 'cup', 'window', 'coffee', 'cake', 'face', 'chair', 'spoon', 'food', 'neck', 'nose', 'mug', 'lid', 'restaurant', 'napkin', 'glasses', 'bowl', 'glass', 'person', 'pot', 'head', 'kettle', 'ear', 'girl', 'bottle', 'pitcher', 'mouth', 'flower', 'fireplace', 'plant', 'fruit', 'eye', 'tea', 'candle', 'belt', 'cabinet', 'door', 'dessert'] 2022-03-16 06:49:01,979.979 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'woman', 'cup', 'hair', 'girl', 'person', 'table', 'wall', 'lady', 'chair', 'plant', 'shirt', 'kitchen', 'coffee', 'bowl', 'handle', 'plate', 'cabinet', 'knife', 'bottle', 'fruit', 'liquid', 'sink', 'glasses', 'chocolate', 'purse', 'fork', 'cake', 'basket', 'necklace', 'drawer', 'sweater', 'mug', 'soda', 'banana', 'spoon', 'tow', 'dessert', 'napkin'] 2022-03-16 06:51:25,781.781 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:58:26 iter: 4600 speed: 311.8 images/sec total_norm: 124.8762 (126.4941) loss: 158.2184 (160.1100) masked_loss: 2.0506 (2.1005) tag_loss: 156.2362 (158.0095) time: 1.4351 (1.6423) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4298 (1.6371) lr: 0.000093 max mem: 26307 2022-03-16 06:51:26,141.141 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7777777910232544 2022-03-16 06:51:26,141.141 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.08828735351562 2022-03-16 06:51:26,141.141 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.12265615260347 2022-03-16 06:51:30,159.159 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015313643962144852 2022-03-16 06:51:30,160.160 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:51:30,160.160 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'wire', 'racks', 'filled', '[MASK]', 'don', '[MASK]', 'and', 'don', '##ut', 'holes', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:51:30,175.175 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'rack', 'hole', 'tray', 'oven', 'metal', 'table', 'different', 'reflection', 'box', 'apple', 'shelf', 'chocolate', 'pastry', 'tomato', 'wall', 'candy', 'light', 'top', 'grill', 'large', 'bunch', 'counter', 'cookie', 'display', 'pan', 'ball', 'food', 'other', 'sign', 'paper', 'variety', 'plastic', 'many', 'stem', 'label', 'bottom', 'bread', 'machine', 'open', 'baking', 'various', 'bar', 'cake', 'container', 'wire', 'orange', 'hand', 'hot', 'glazed'] 2022-03-16 06:51:46,143.143 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'orange', 'hole', 'apple', 'wire', 'shelf', 'tray', 'rack'] 2022-03-16 06:54:09,913.913 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:55:59 iter: 4700 speed: 311.9 images/sec total_norm: 126.5066 (130.3545) loss: 163.5384 (163.6275) masked_loss: 2.0340 (2.0479) tag_loss: 162.0044 (161.5796) time: 1.4340 (1.6414) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6363) lr: 0.000093 max mem: 26307 2022-03-16 06:54:10,276.276 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 06:54:10,276.276 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.82684326171875 2022-03-16 06:54:10,276.276 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.18111085891724 2022-03-16 06:54:14,288.288 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015371406450867653 2022-03-16 06:54:14,288.288 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:54:14,288.288 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boy', 'with', 'his', 'baseball', 'mit', '##t', 'and', 'ball', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:54:14,303.303 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'nose', 'face', 'boy', 'eye', 'lip', 'hair', 'head', 'mouth', 'person', 'hand', 'glove', 'girl', '[UNK]', 'ball', 'collar', 'baseball', 'ear', 'finger', 'woman', 'glasses', 'strap', 'man', 'eyebrow', 'arm', 'hat', 'tree', 'logo', 'cap', 'building', 'sunglasses', 'neck', 'button', 'letter', 'sleeve', 'wall', 'chin', 'bat', 'window', 'child', 'handle', 'young', 'jacket', 'stripe', 'thumb', 'sky', 'zipper', 'pole', 'background', 'fence'] 2022-03-16 06:54:30,276.276 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'face', 'building', 'hair', 'mouth', 'person', 'boy', 'writing', 'eye', 'window', 'baseball', 'ball', 'letter', 'shirt', 'nose', 'ear', 'hole', 'lip', 'hat', 'cap', 'eyebrow', 'glove', 'strap'] 2022-03-16 06:56:54,054.054 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:53:31 iter: 4800 speed: 311.9 images/sec total_norm: 126.9829 (127.8083) loss: 163.8952 (164.3496) masked_loss: 2.1667 (2.1435) tag_loss: 161.8268 (162.2061) time: 1.4342 (1.6414) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6362) lr: 0.000093 max mem: 26307 2022-03-16 06:56:54,415.415 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 06:56:54,415.415 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 179.7439422607422 2022-03-16 06:56:54,415.415 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
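The `Tag Precision.` figures hovering around 68 are printed without their definition; the score threshold, any top-k cutoff, and the batch aggregation are all unstated. Purely to illustrate the shape of the quantity, a set-overlap precision between predicted and ground-truth tags could look like the following; this is not the pipeline's actual formula:

```python
def tag_precision(predicted, gt):
    """Illustrative only: percentage of predicted tags present in the
    ground-truth tag set."""
    predicted, gt = set(predicted), set(gt)
    return 100.0 * len(predicted & gt) / max(len(predicted), 1)

pred = ["shirt", "nose", "face", "boy", "eye", "ball", "glove"]
truth = ["boy", "shirt", "ball", "glove", "cap", "eye"]
print(f"Tag Precision. = {tag_precision(pred, truth)}")
```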
= 68.14169918760962 2022-03-16 06:56:58,507.507 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015463724732398987 2022-03-16 06:56:58,507.507 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:56:58,507.507 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'buses', '[MASK]', 'lined', 'up', 'waiting', '[MASK]', 'passengers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:56:58,522.522 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bus', 'sky', 'window', 'light', 'pole', 'windshield', '[UNK]', 'street', 'plate', 'sign', 'road', 'license', 'number', 'tire', 'building', 'mirror', 'door', 'letter', 'wheel', 'word', 'person', 'car', 'front', 'man', 'tree', 'driver', 'line', 'roof', 'cloud', 'shirt', 'sidewalk', 'bumper', 'curb', 'logo', 'stop', 'top', 'fence', 'decker', 'red', 'woman', 'traffic', 'lot', 'steering', 'double', 'next', 'advertisement', 'reflection', 'grass', 'van', 'hood'] 2022-03-16 06:57:14,567.567 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'line', 'door', 'road', 'street', 'light', 'window', 'tree', 'letter', 'sky', 'bus', 'plate', 'wheel', 'mirror', 'license', 'pole', 'tire', 'antenna', 'windshield'] 2022-03-16 06:59:38,087.087 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:51:02 iter: 4900 speed: 312.1 images/sec total_norm: 128.6028 (130.9083) loss: 161.6124 (162.2776) masked_loss: 2.0340 (2.0938) tag_loss: 159.4704 (160.1839) time: 1.4327 (1.6403) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4276 (1.6353) lr: 0.000093 max mem: 26307 2022-03-16 06:59:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 06:59:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.79084777832031 2022-03-16 06:59:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.27909439086915 2022-03-16 06:59:42,684.684 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015442762523889542 2022-03-16 06:59:42,685.685 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 06:59:42,685.685 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pink', 'plastic', 'tray', 'has', 'food', '[MASK]', 'compartments', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 06:59:42,700.700 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['carrot', 'container', '[UNK]', 'food', 'lid', 'table', 'bowl', 'vegetable', 'box', 'plastic', 'potato', 'bean', 'tray', 'bread', 'cheese', 'sauce', 'tomato', 'grape', 'fruit', 'fork', 'meat', 'rice', 'plate', 'sausage', 'egg', 'dish', 'candy', 'mushroom', 'different', 'nut', 'lemon', 'stem', 'cookie', 'cup', 'handle', 'spoon', 'slice', 'corn', 'onion', 'orange', 'top', 'bunch', 'pea', 'pepper', 'close', 'lunch', 'full', 'green', 'dog', 'butter'] 2022-03-16 06:59:58,706.706 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'food', 'pink', 'fruit', 'plastic', 'apple', 'pile', 'candy', 'container', 'tray', 'lid', 'lime', 'lemon', 'potato', 'grape', 'vegetable', 'nut', 'tomato', 'onion', 'carrot'] 2022-03-16 07:02:22,301.301 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:48:34 iter: 5000 speed: 311.8 images/sec total_norm: 126.9490 (130.2349) loss: 160.4424 (162.6607) masked_loss: 2.0719 (2.0567) tag_loss: 158.1428 (160.6040) time: 1.4343 (1.6421) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4293 (1.6372) lr: 0.000092 max mem: 26307 2022-03-16 07:02:22,303.303 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0005000.pt 2022-03-16 07:03:36,062.062 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 07:03:36,062.062 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.119140625 2022-03-16 07:03:36,062.062 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.31408407173905 2022-03-16 07:03:40,214.214 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015399402938783169 2022-03-16 07:03:40,214.214 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:03:40,215.215 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'cake', 'left', 'on', 'a', 'plate', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:03:40,231.231 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'plate', 'table', 'knife', 'handle', '[UNK]', 'blade', 'fork', 'piece', 'chocolate', 'layer', 'shadow', 'slice', 'spoon', 'dessert', 'napkin', 'board', 'white', 'sauce', 'top', 'paper', 'pie', 'cardboard', 'leaf', 'design', 'person', 'whipped', 'cup', 'food', 'box', 'desert', 'crust', 'next', 'container', 'glass', 'wall', 'object', 'bottle', 'cutting', 'tray', 'cream', 'reflection', 'light', 'candle', 'screw', 'floor', 'cloth', 'stem', 'different', 'hole'] 2022-03-16 07:03:56,304.304 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'piece', 'hole', 'handle', 'plate', 'knife', 'blade', 'cake', 'pizza', 'napkin', 'shovel'] 2022-03-16 07:06:19,340.340 2829:trainer.py:487 do_train_dict(): eta: 1 day, 4:00:42 iter: 5100 speed: 216.0 images/sec total_norm: 128.8725 (130.2941) loss: 159.0382 (159.6953) masked_loss: 2.0793 (2.0698) tag_loss: 156.9288 (157.6255) time: 1.4340 (2.3704) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4290 (1.6315) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:06:19,701.701 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 07:06:19,702.702 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.65884399414062 2022-03-16 07:06:19,702.702 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
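At iter 5000 the trainer writes `snapshot/model_iter_0005000.pt`, and the next logging window books the cost: `save_time: 73.3883` lifts that window's mean iteration time from ~1.64 s to 2.3704 s, which is why the reported speed dips to 216.0 images/sec (512 / 2.3704) for that one interval. A sketch of periodic checkpointing that returns the elapsed time for logging; the 5000-iteration boundary is from the log, treating it as the save period is an inference, and the names here are illustrative:

```python
import time
import torch
from torch import nn, optim

model = nn.Linear(4, 4)
opt = optim.SGD(model.parameters(), lr=1e-4)

def save_checkpoint(it, prefix="model_iter"):
    # Mirrors the snapshot basename in the log (model_iter_0005000.pt) and
    # returns the elapsed seconds that the trainer reports as save_time.
    start = time.time()
    torch.save({"iteration": it,
                "model": model.state_dict(),
                "optimizer": opt.state_dict()},
               f"{prefix}_{it:07d}.pt")
    return time.time() - start

it = 5000
if it % 5000 == 0:                 # save period assumed from the single save seen
    print("save_time:", save_checkpoint(it))
```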
= 68.29552195622371 2022-03-16 07:06:23,907.907 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015500053763389587 2022-03-16 07:06:23,907.907 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:06:23,908.908 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'clock', 'set', 'on', '[MASK]', 'of', 'a', 'rhino', '##cer', '##os', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:06:23,923.923 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'number', 'wall', 'hand', 'face', '[UNK]', 'statue', 'eagle', 'design', 'lion', 'large', 'painting', 'picture', 'bird', 'roman', 'head', 'leaf', 'wing', 'sun', 'hour', 'sword', 'frame', 'building', 'base', 'leg', 'door', 'name', 'gold', 'side', 'handle', 'top', 'old', 'decoration', 'front', 'tree', 'window', 'table', 'big', 'foot', 'background', 'column', 'minute', 'crown', 'emblem', 'word', 'white', 'ornate', 'wood', 'man', 'floor'] 2022-03-16 07:06:39,922.922 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'number', 'face', 'top', 'wall', 'base', 'eye', 'foot', 'window', 'leg', 'nose', 'ear', 'sword', 'clock', 'statue', 'bull', 'umbrella'] 2022-03-16 07:09:03,498.498 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:57:54 iter: 5200 speed: 311.9 images/sec total_norm: 125.0973 (126.6906) loss: 168.7177 (167.8608) masked_loss: 2.0960 (2.1171) tag_loss: 166.5221 (165.7437) time: 1.4337 (1.6416) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.6364) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:09:03,860.860 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 07:09:03,860.860 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.057861328125 2022-03-16 07:09:03,860.860 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.30077779517984 2022-03-16 07:09:08,081.081 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015462061390280724 2022-03-16 07:09:08,081.081 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:09:08,081.081 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', '[MASK]', 'wine', 'into', 'two', 'other', 'men', '##s', 'wine', 'glasses', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:09:08,096.096 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wine', 'glass', 'man', 'hand', 'bottle', 'hair', 'head', 'jacket', 'table', 'shirt', 'button', 'glasses', 'paper', 'woman', 'jean', 'face', '[UNK]', 'ear', 'suit', 'wall', 'arm', 'coat', 'napkin', 'bucket', 'person', 'nose', 'ceiling', 'bowl', 'light', 'label', 'bar', 'watch', 'pot', 'cup', 'window', 'sweater', 'door', 'hat', 'menu', 'sign', 'cap', 'container', 'shelf', 'picture', 'eye', 'pitcher', 'group', 'mouth', 'chair', 'sleeve'] 2022-03-16 07:09:24,027.027 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'name', 'hand', 'face', 'woman', 'cup', 'hair', 'person', 'arm', 'glass', 'ring', 'sign', 'jean', 'newspaper', 'shirt', 'wine', 'bag', 'ear', 'bowl', 'suit', 'tie', 'bottle', 'tag', 'button', 'jacket', 'pen', 'glasses', 'logo', 'barrel', 'purse', 'collar', 'sleeve', 'container', 'bucket'] 2022-03-16 07:11:47,667.667 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:55:06 iter: 5300 speed: 311.9 images/sec total_norm: 127.1204 (130.1645) loss: 165.2025 (164.0027) masked_loss: 1.9677 (2.0347) tag_loss: 163.4390 (161.9680) time: 1.4347 (1.6417) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4297 (1.6367) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:11:48,032.032 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 07:11:48,032.032 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.82122802734375 2022-03-16 07:11:48,032.032 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.40083567301433 2022-03-16 07:11:52,325.325 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015472730621695518 2022-03-16 07:11:52,326.326 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:11:52,326.326 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'abby', 'a', '[MASK]', 'on', 'smiling', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:11:52,342.342 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'boy', 'eye', 'ear', 'hair', 'tie', 'nose', 'wall', 'head', 'face', 'neck', 'collar', 'lip', 'mouth', 'eyebrow', 'knot', 'button', 'teeth', 'door', 'young', 'picture', 'background', '[UNK]', 'shadow', 'smile', 'object', 'shoulder', 'child', 'black', 'stripe', 'front', 'chair', 'room', 'man', 'pocket', 'light', 'blue', 'forehead', 'window', 'chin', 'cheek', 'person', 'little', 'handle', 'shelf', 'frame', 'dress', 'arm', 'table', 'logo'] 2022-03-16 07:12:08,487.487 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'door', 'young', 'hair', 'mouth', 'wall', 'smile', 'boy', 'eye', 'neck', 'shirt', 'nose', 'ear', 'object', 'lip', 'tie', 'collar', 'eyebrow', 'knot'] 03-16 07:12:23.737 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 07:12:23.737 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 07:12:24.790 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 07:14:31,863.863 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:52:19 iter: 5400 speed: 311.8 images/sec total_norm: 127.9301 (130.1617) loss: 165.5096 (165.5691) masked_loss: 2.0282 (2.0227) tag_loss: 163.1503 (163.5464) time: 1.4335 (1.6420) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.6368) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:14:32,225.225 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 07:14:32,225.225 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.86669921875 2022-03-16 07:14:32,226.226 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.37238422740589 2022-03-16 07:14:36,550.550 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0154603635892272 2022-03-16 07:14:36,550.550 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:14:36,551.551 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'elephant', '[MASK]', 'a', 'shovel', 'with', 'its', 'trunk', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:14:36,566.566 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'ground', 'shadow', 'person', 'leg', 'trunk', 'tree', 'ear', 'foot', '[UNK]', 'head', 'man', 'grass', 'shirt', 'fence', 'sky', 'eye', 'pole', 'hat', 'chain', 'sign', 'jacket', 'dirt', 'hose', 'tail', 'stick', 'flag', 'rope', 'hair', 'truck', 'bench', 'hand', 'tire', 'woman', 'crowd', 'cane', 'roof', 'umbrella', 'shoe', 'chair', 'bell', 'seat', 'vehicle', 'jean', 'post', 'toe', 'stand', 'large', 'wheel', 'trailer'] 2022-03-16 07:14:52,627.627 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'building', 'ground', 'post', 'person', 'eye', 'foot', 'tree', 'box', 'sign', 'sky', 'block', 'shirt', 'leg', 'roof', 'ear', 'shadow', 'flag', 'grass', 'trunk', 'fence', 'banner', 'elephant', 'saddle', 'paddle', 'shovel'] 2022-03-16 07:17:16,361.361 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:49:35 iter: 5500 speed: 311.3 images/sec total_norm: 126.0490 (127.5789) loss: 164.1419 (164.4758) masked_loss: 2.1087 (2.0897) tag_loss: 162.0293 (162.3861) time: 1.4336 (1.6450) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4284 (1.6399) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:17:16,724.724 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3055555522441864 2022-03-16 07:17:16,724.724 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.3013916015625 2022-03-16 07:17:16,725.725 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.40960775102887 2022-03-16 07:17:21,090.090 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015497853979468346 2022-03-16 07:17:21,090.090 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:17:21,091.091 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'orange', 'flower', 'resting', 'in', '[MASK]', 'oddly', 'shaped', 'vase', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:17:21,106.106 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'vase', 'stem', 'leaf', '[UNK]', 'table', 'background', 'glass', 'wall', 'handle', 'water', 'rose', 'plant', 'reflection', 'white', 'shadow', 'scissors', 'paper', 'clear', 'blade', 'bottom', 'base', 'red', 'top', 'item', 'light', 'rim', 'orange', 'line', 'small', 'design', 'pink', 'next', 'couple', 'shelf', 'ground', 'yellow', 'mirror', 'purple', 'blue', 'bouquet', 'green', 'hole', 'pair', 'colorful', 'bud', 'branch', 'frame', 'object', 'sky'] 2022-03-16 07:17:37,066.066 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'rose', 'paper', 'background', 'orange', 'shaped', 'blade', 'flower', 'leaf', 'stem', 'vase'] 2022-03-16 07:20:00,665.665 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:46:49 iter: 5600 speed: 311.6 images/sec total_norm: 125.3712 (127.0534) loss: 161.7146 (163.6505) masked_loss: 2.0062 (2.0626) tag_loss: 160.0377 (161.5879) time: 1.4329 (1.6430) data: 0.0001 (0.0005) to_device: 0.0049 (0.0047) time_gpu: 1.4279 (1.6378) save_time: 73.3883 (73.3883) lr: 0.000092 max mem: 26307 2022-03-16 07:20:01,026.026 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 07:20:01,026.026 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.0405731201172 2022-03-16 07:20:01,027.027 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.44808437949733 2022-03-16 07:20:05,419.419 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015514500439167023 2022-03-16 07:20:05,419.419 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:20:05,420.420 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'sitting', 'in', 'front', 'of', 'a', '[MASK]', 'sitting', 'on', 'top', 'of', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:20:05,435.435 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'ring', 'woman', 'table', 'scarf', 'plate', 'finger', 'nose', 'shirt', 'glass', 'hair', '[UNK]', 'head', 'face', 'mouth', 'eye', 'food', 'arm', 'knife', 'wall', 'fork', 'cup', 'ear', 'person', 'neck', 'chair', 'bowl', 'glasses', 'napkin', 'cake', 'handle', 'dress', 'water', 'bread', 'bracelet', 'window', 'bottle', 'spoon', 'sandwich', 'wrist', 'girl', 'watch', 'meal', 'lid', 'pizza', 'top', 'man', 'paper', 'sleeve', 'wine'] 2022-03-16 07:20:21,395.395 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'face', 'top', 'front', 'woman', 'hair', 'girl', 'mouth', 'table', 'eye', 'chair', 'ring', 'finger', 'nose', 'handle', 'plate', 'lip', 'knife', 'fork', 'eyebrow', 'sleeve', 'pizza', 'slice', 'scarf'] 2022-03-16 07:22:44,945.945 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:44:02 iter: 5700 speed: 311.7 images/sec total_norm: 128.5414 (130.6068) loss: 161.8747 (162.9042) masked_loss: 2.0387 (2.0522) tag_loss: 159.6080 (160.8520) time: 1.4328 (1.6428) data: 0.0001 (0.0001) to_device: 0.0050 (0.0050) time_gpu: 1.4277 (1.6376) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:22:45,309.309 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 07:22:45,309.309 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.4256591796875 2022-03-16 07:22:45,309.309 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.48733481045427 2022-03-16 07:22:49,720.720 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015526284463703632 2022-03-16 07:22:49,720.720 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:22:49,721.721 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'large', 'black', 'bear', '[MASK]', '[MASK]', 'the', 'woods', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:22:49,736.736 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'bear', 'grass', 'forest', 'trunk', 'head', 'ground', 'ear', 'wood', 'branch', 'flower', 'leg', 'rock', 'field', 'plant', 'bush', '[UNK]', 'black', 'face', 'leaf', 'area', 'log', 'snout', 'brown', 'nose', 'large', 'fur', 'dirt', 'back', 'water', 'stick', 'cub', 'green', 'hill', 'stump', 'path', 'paw', 'grassy', 'next', 'trail', 'weed', 'fern', 'body', 'wooded', 'walking', 'big', 'bird', 'pine', 'standing', 'hillside'] 2022-03-16 07:23:05,672.672 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'black', 'large', 'field', 'forest', 'plant', 'tree', 'wood', 'branch', 'ear', 'bear', 'grass', 'flower', 'trunk', 'foraging'] 2022-03-16 07:25:29,257.257 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:41:17 iter: 5800 speed: 311.6 images/sec total_norm: 127.0965 (131.0839) loss: 161.9664 (163.6607) masked_loss: 2.0147 (2.0309) tag_loss: 159.4530 (161.6298) time: 1.4333 (1.6432) data: 0.0002 (0.0002) to_device: 0.0048 (0.0047) time_gpu: 1.4285 (1.6383) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:25:29,618.618 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 07:25:29,618.618 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.27671813964844 2022-03-16 07:25:29,618.618 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.5447828648454 2022-03-16 07:25:34,061.061 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015603570267558098 2022-03-16 07:25:34,062.062 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:25:34,062.062 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'strawberry', 'milk', 'shake', 'and', 'two', 'straw', '[MASK]', 'on', 'a', 'plate', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:25:34,077.077 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'glass', 'strawberry', 'leaf', 'plate', 'stem', 'drink', 'foam', 'base', 'fruit', 'plant', 'top', '[UNK]', 'tray', 'cup', 'shadow', 'flower', 'milk', 'ice', 'container', 'vase', 'wall', 'reflection', 'juice', 'berry', 'napkin', 'white', 'rim', 'window', 'dessert', 'food', 'water', 'liquid', 'coaster', 'banana', 'straw', 'red', 'background', 'light', 'bowl', 'spoon', 'coffee', 'chair', 'next', 'cream', 'handle', 'beverage', 'bottle', 'person', 'jar'] 2022-03-16 07:25:49,972.972 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['top', 'table', 'base', 'glass', 'plant', 'background', 'drink', 'plate', 'shadow', 'shake', 'milk', 'leaf', 'stem', 'rim', 'tray', 'strawberry', 'foam'] 2022-03-16 07:28:13,684.684 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:38:32 iter: 5900 speed: 311.4 images/sec total_norm: 129.1343 (130.4509) loss: 160.3487 (162.0344) masked_loss: 1.9951 (2.0217) tag_loss: 159.0124 (160.0128) time: 1.4331 (1.6442) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.6391) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:28:14,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.375 2022-03-16 07:28:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.78659057617188 2022-03-16 07:28:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.49379030863444 2022-03-16 07:28:18,518.518 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015681199729442596 2022-03-16 07:28:18,518.518 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:28:18,519.519 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'professional', 'snow', 'board', 'athlete', '[MASK]', 'flight', 'on', 'their', 'board', 'rangers', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:28:18,534.534 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'sky', 'jacket', 'man', 'helmet', 'glove', 'person', 'arm', 'board', 'building', 'air', 'vest', 'roof', 'hand', 'head', 'letter', 'trick', 'boot', 'coat', 'design', 'logo', 'snow', 'shoe', 'foot', 'face', 'leg', 'hood', 'wall', 'jump', 'sleeve', 'number', 'hat', 'flag', 'top', 'structure', 'yellow', 'ramp', 'stripe', 'shirt', 'fence', 'tree', 'boy', 'sign', 'word', 'writing', 'jean', 'window', 'pole', 'ground', 'wire'] 2022-03-16 07:28:34,596.596 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'building', 'board', 'professional', 'person', 'arm', 'foot', 'window', 'flight', 'letter', 'sky', 'roof', 'snow', 'coat', 'jacket', 'logo', 'athlete', 'boot', 'helmet', 'shoe', 'glove', 'vest'] 2022-03-16 07:30:58,174.174 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:35:48 iter: 6000 speed: 311.3 images/sec total_norm: 126.8395 (126.0948) loss: 157.8949 (159.6793) masked_loss: 1.9517 (2.0002) tag_loss: 155.9323 (157.6790) time: 1.4335 (1.6449) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4285 (1.6400) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:30:58,536.536 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-16 07:30:58,536.536 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.12246704101562 2022-03-16 07:30:58,537.537 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.49794131419698 2022-03-16 07:31:03,071.071 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015682023018598557 2022-03-16 07:31:03,071.071 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:31:03,072.072 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', '[MASK]', 'is', 'hanging', 'out', 'by', 'some', 'rocks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:31:03,087.087 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['rock', 'bear', 'head', 'ear', 'nose', 'fur', 'snout', 'plant', 'ground', 'leg', 'face', 'eye', 'wall', 'weed', 'paw', 'mouth', 'leaf', 'boulder', 'grass', 'shadow', 'back', 'black', 'moss', 'water', 'animal', 'stone', 'tree', 'foot', '[UNK]', 'large', 'brown', 'claw', 'zoo', 'tongue', 'log', 'branch', 'arm', 'bush', 'tail', 'rocky', 'snow', 'dirt', 'trunk', 'polar', 'neck', 'next', 'hair', 'flower', 'small', 'enclosure'] 2022-03-16 07:31:19,071.071 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'face', 'light', 'ground', 'rock', 'mouth', 'wall', 'eye', 'plant', 'animal', 'tongue', 'ear', 'bear', 'grass', 'tail', 'fur', 'leaf', 'weed'] 2022-03-16 07:33:42,598.598 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:33:03 iter: 6100 speed: 311.4 images/sec total_norm: 124.5278 (127.5351) loss: 159.5699 (161.6323) masked_loss: 1.9772 (1.9913) tag_loss: 157.5625 (159.6409) time: 1.4335 (1.6443) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4284 (1.6392) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:33:42,958.958 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 07:33:42,959.959 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 179.55850219726562 2022-03-16 07:33:42,959.959 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.45158533896169 2022-03-16 07:33:47,523.523 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01565631479024887 2022-03-16 07:33:47,523.523 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:33:47,524.524 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', 'swinging', 'a', 'bat', 'in', 'a', 'grassy', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:33:47,540.540 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', '[UNK]', 'shirt', 'shoe', 'hand', 'leg', 'man', 'head', 'field', 'hair', 'arm', 'person', 'ground', 'boy', 'face', 'tree', 'fence', 'logo', 'park', 'cap', 'sock', 'ear', 'hat', 'pole', 'jean', 'dirt', 'shadow', 'baseball', 'glove', 'woman', 'belt', 'background', 'nose', 'ball', 'young', 'bat', 'glasses', 'short', 'stripe', 'car', 'building', 'watch', 'mouth', 'sleeve', 'jersey', 'girl', 'sunglasses', 'design', 'line', 'bracelet'] 2022-03-16 07:34:03,579.579 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'field', 'arm', 'window', 'tree', 'shirt', 'leg', 'roof', 'grass', 'hat', 'bat', 'shoe', 'grassy'] 2022-03-16 07:36:27,017.017 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:30:19 iter: 6200 speed: 311.4 images/sec total_norm: 127.5968 (129.2917) loss: 162.7476 (165.0517) masked_loss: 1.9685 (2.0431) tag_loss: 160.4079 (163.0087) time: 1.4332 (1.6442) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.6391) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:36:27,377.377 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-16 07:36:27,378.378 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.37649536132812 2022-03-16 07:36:27,378.378 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.51522778707837 2022-03-16 07:36:31,995.995 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015720687806606293 2022-03-16 07:36:31,995.995 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:36:31,996.996 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bathroom', 'with', '[MASK]', 'shower', ',', 'sink', 'and', 'mirror', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:36:32,011.011 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['door', 'wall', 'bathroom', 'floor', '[UNK]', 'knob', 'sink', 'tile', 'shower', 'handle', 'mirror', 'light', 'towel', 'ceiling', 'toilet', 'window', 'cabinet', 'head', 'tub', 'rack', 'reflection', 'picture', 'rug', 'drain', 'soap', 'holder', 'switch', 'outlet', 'curtain', 'bottle', 'lid', 'paper', 'doorway', 'seat', 'drawer', 'shelf', 'lamp', 'room', 'frame', 'can', 'glass', 'white', 'hair', 'tank', 'dish', 'vanity', 'cup', 'vent', 'rod', 'box'] 2022-03-16 07:36:47,975.975 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'door', 'light', 'floor', 'wall', 'ring', 'cabinet', 'mirror', 'bathroom', 'ceiling', 'shower', 'pole', 'hallway', 'switch', 'sink', 'rod', 'holder', 'reflection', 'outlet', 'tile', 'tub', 'knob'] 2022-03-16 07:39:11,461.461 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:27:34 iter: 6300 speed: 311.4 images/sec total_norm: 124.5659 (128.2067) loss: 163.1437 (162.3157) masked_loss: 2.0132 (2.0708) tag_loss: 160.6618 (160.2449) time: 1.4330 (1.6445) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4280 (1.6394) save_time: 73.3883 (73.3883) lr: 0.000091 max mem: 26307 2022-03-16 07:39:11,821.821 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 07:39:11,822.822 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.84521484375 2022-03-16 07:39:11,822.822 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.56801855564117 2022-03-16 07:39:16,474.474 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0158122256398201 2022-03-16 07:39:16,474.474 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:39:16,475.475 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'ceiling', 'fan', 'is', 'turned', '[MASK]', 'in', 'the', 'kitchen', 'of', 'a', 'house', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:39:16,490.490 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kitchen', '[UNK]', 'cabinet', 'ceiling', 'wall', 'handle', 'window', 'light', 'refrigerator', 'door', 'stove', 'drawer', 'oven', 'sink', 'floor', 'fan', 'coffee', 'pot', 'bowl', 'towel', 'kettle', 'bottle', 'microwave', 'cup', 'outlet', 'top', 'tile', 'knob', 'maker', 'pitcher', 'container', 'tea', 'mixer', 'lid', 'paper', 'rack', 'mug', 'basket', 'magnet', 'picture', 'counter', 'knife', 'curtain', 'shelf', 'flower', 'chair', 'hood', 'vase', 'jar', 'can'] 2022-03-16 07:39:32,547.547 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'top', 'door', 'light', 'cup', 'floor', 'wall', 'window', 'kitchen', 'picture', 'coffee', 'bowl', 'handle', 'cabinet', 'fan', 'bottle', 'ceiling', 'sink', 'pot', 'maker', 'towel', 'drawer', 'tile', 'banana', 'stove', 'knob', 'oven', 'refrigerator', 'mixer'] 2022-03-16 07:41:56,170.170 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:24:52 iter: 6400 speed: 310.9 images/sec total_norm: 125.1325 (127.5553) loss: 159.4396 (160.4304) masked_loss: 1.9751 (2.0432) tag_loss: 157.7386 (158.3872) time: 1.4347 (1.6471) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4297 (1.6420) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:41:56,530.530 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 07:41:56,530.530 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.12315368652344 2022-03-16 07:41:56,530.530 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.60622863769531 2022-03-16 07:42:01,216.216 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01582055166363716 2022-03-16 07:42:01,216.216 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:42:01,217.217 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', '[MASK]', 'in', '[MASK]', 'with', 'a', 'a', 'lot', 'of', 'luggage', 'dirt', 'road', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:42:01,232.232 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'sky', 'bag', 'shirt', 'sunglasses', 'short', 'shoe', 'backpack', 'truck', 'leg', 'umbrella', 'shadow', 'head', 'hand', 'watch', 'ground', 'luggage', 'pole', '[UNK]', 'face', 'bench', 'tree', 'hair', 'person', 'jacket', 'road', 'mountain', 'hat', 'arm', 'water', 'wheel', 'cloud', 'tire', 'wall', 'suitcase', 'boat', 'sign', 'grass', 'stripe', 'vehicle', 'light', 'post', 'door', 'window', 'roof', 'group', 'pile', 'strap', 'building', 'dirt'] 2022-03-16 07:42:17,243.243 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'road', 'short', 'hair', 'lot', 'watch', 'sky', 'shirt', 'leg', 'vehicle', 'bag', 'truck', 'wheel', 'sand', 'pole', 'jacket', 'dirt', 'pile', 'shoe', 'cart', 'tire', 'umbrella', 'backpack', 'sunglasses', 'luggage', 'vest'] 03-16 07:42:24.889 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 07:42:24.889 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 07:42:26.246 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 07:44:40,880.880 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:22:10 iter: 6500 speed: 310.9 images/sec total_norm: 126.5109 (127.8441) loss: 159.2637 (160.3642) masked_loss: 1.9964 (2.0214) tag_loss: 157.2803 (158.3429) time: 1.4337 (1.6471) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4289 (1.6421) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:44:41,241.241 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544 2022-03-16 07:44:41,241.241 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.2715606689453 2022-03-16 07:44:41,242.242 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.60344106500798 2022-03-16 07:44:45,981.981 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015841631218791008 2022-03-16 07:44:45,981.981 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:44:45,981.981 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'women', 'cutting', 'a', '[MASK]', 'cake', 'with', 'one', 'lit', 'candle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:44:45,996.996 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'wall', 'cake', 'table', 'hair', 'candle', '[UNK]', 'shirt', 'hand', 'head', 'woman', 'man', 'bowl', 'face', 'tie', 'glasses', 'person', 'food', 'light', 'suit', 'napkin', 'jacket', 'ear', 'flower', 'floor', 'stack', 'cup', 'tray', 'knife', 'dress', 'window', 'chair', 'room', 'glass', 'ceiling', 'box', 'curtain', 'pizza', 'display', 'arm', 'dessert', 'nose', 'couple', 'group', 'paper', 'cookie', 'cloth', 'spoon', 'dish', 'coat'] 2022-03-16 07:45:01,998.998 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'large', 'woman', 'hair', 'table', 'wall', 'food', 'chair', 'shirt', 'bowl', 'plate', 'coat', 'knife', 'lit', 'jacket', 'glasses', 'cloth', 'collar', 'cake', 'candle', 'napkin'] 2022-03-16 07:47:25,518.518 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:19:28 iter: 6600 speed: 311.0 images/sec total_norm: 128.4024 (130.6842) loss: 162.3504 (161.6405) masked_loss: 1.9337 (2.0026) tag_loss: 160.5454 (159.6380) time: 1.4338 (1.6464) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4285 (1.6413) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:47:25,880.880 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 07:47:25,881.881 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 223.20375061035156 2022-03-16 07:47:25,881.881 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.54124120456069 2022-03-16 07:47:30,661.661 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01588067039847374 2022-03-16 07:47:30,662.662 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:47:30,662.662 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'in', '[MASK]', 'red', 'sweatshirt', 'sitting', 'on', 'the', 'floor', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:47:30,678.678 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'floor', 'hand', 'shirt', 'girl', 'tile', 'ponytail', '[UNK]', 'woman', 'sweater', 'head', 'jean', 'controller', 'arm', 'chair', 'wall', 'shoe', 'rug', 'ear', 'table', 'jacket', 'game', 'carpet', 'face', 'leg', 'sweatshirt', 'box', 'remote', 'person', 'nose', 'man', 'book', 'ribbon', 'cord', 'young', 'sock', 'room', 'sleeve', 'wii', 'stand', 'bag', 'boy', 'child', 'glasses', 'band', 'paper', 'glass', 'foot', 'bowl', 'can'] 2022-03-16 07:47:46,615.615 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'red', 'woman', 'board', 'hair', 'girl', 'floor', 'table', 'wall', 'arm', 'chair', 'plant', 'watch', 'jean', 'shirt', 'handle', 'cabinet', 'leaf', 'wrist', 'towel', 'ribbon', 'curtain', 'cord', 'tile', 'sweater', 'magnet', 'oven', 'refrigerator', 'ponytail', 'sweatshirt'] 2022-03-16 07:50:10,424.424 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:16:47 iter: 6700 speed: 310.5 images/sec total_norm: 126.4411 (129.7306) loss: 160.9938 (164.2005) masked_loss: 1.9044 (1.9176) tag_loss: 159.0402 (162.2829) time: 1.4345 (1.6490) data: 0.0002 (0.0005) to_device: 0.0048 (0.0046) time_gpu: 1.4294 (1.6439) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:50:10,785.785 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 07:50:10,785.785 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.44815063476562 2022-03-16 07:50:10,785.785 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.57696050756118 2022-03-16 07:50:15,556.556 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01590898633003235 2022-03-16 07:50:15,556.556 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:50:15,556.556 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'living', 'room', 'filled', '[MASK]', 'furniture', 'and', 'a', '[MASK]', 'flat', 'screen', 'tv', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:50:15,572.572 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'room', 'television', 'floor', 'curtain', 'book', 'window', 'picture', 'living', 'ceiling', 'shelf', 'table', '[UNK]', 'couch', 'door', 'stand', 'chair', 'pillow', 'rug', 'coffee', 'screen', 'sofa', 'fireplace', 'toy', 'lamp', 'light', 'bag', 'clock', 'speaker', 'box', 'basket', 'center', 'entertainment', 'blanket', 'doorway', 'frame', 'cabinet', 'paper', 'tv', 'shade', 'remote', 'magazine', 'candle', 'plant', 'cup', 'ottoman', 'cushion', 'dog', 'vase', 'mirror'] 2022-03-16 07:50:31,476.476 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'room', 'book', 'door', 'center', 'light', 'living', 'television', 'board', 'floor', 'star', 'table', 'wall', 'magazine', 'stand', 'chair', 'paper', 'window', 'box', 'ball', 'sign', 'picture', 'screen', 'entertainment', 'animal', 'coffee', 'painting', 'flat', 'ceiling', 'furniture', 'toy', 'pillow', 'basket', 'curtain', 'shelf', 'laptop', 'fireplace', 'mantle', 'rug'] 2022-03-16 07:52:55,322.322 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:14:06 iter: 6800 speed: 310.5 images/sec total_norm: 125.8012 (128.2865) loss: 161.8088 (163.0958) masked_loss: 1.8929 (1.9011) tag_loss: 159.5959 (161.1946) time: 1.4348 (1.6490) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4295 (1.6439) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:52:55,683.683 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 07:52:55,683.683 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 188.1556396484375 2022-03-16 07:52:55,683.683 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.4946517944336 2022-03-16 07:53:00,509.509 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.015924029052257538 2022-03-16 07:53:00,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:53:00,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'a', 'women', 'who', 'are', '[MASK]', 'by', 'the', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:53:00,525.525 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'shirt', 'woman', 'sky', 'sunglasses', 'hair', 'roof', 'grass', 'window', 'road', 'man', 'building', 'head', 'truck', 'person', 'hand', 'watch', 'pole', 'tire', 'top', 'street', 'face', 'tank', 'wire', 'arm', 'bus', 'line', 'house', 'car', 'strap', 'bag', '[UNK]', 'purse', 'stripe', 'windshield', 'sign', 'wheel', 'necklace', 'curb', 'phone', 'shadow', 'couple', 'writing', 'dress', 'lady', 'wrist', 'number', 'front', 'light', 'logo'] 2022-03-16 07:53:16,478.478 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'house', 'hand', 'number', 'face', 'line', 'building', 'road', 'street', 'woman', 'hair', 'mouth', 'person', 'arm', 'window', 'tree', 'watch', 'sky', 'jean', 'shirt', 'bus', 'roof', 'bag', 'truck', 'shadow', 'grass', 'wire', 'trunk', 'tire', 'necklace', 'backpack', 'curb', 'strap', 'sunglasses'] 2022-03-16 07:55:40,274.274 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:11:26 iter: 6900 speed: 310.4 images/sec total_norm: 126.1319 (128.3255) loss: 157.1677 (158.2721) masked_loss: 1.9626 (1.9616) tag_loss: 155.5383 (156.3105) time: 1.4337 (1.6495) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4287 (1.6446) save_time: 73.3883 (73.3883) lr: 0.000090 max mem: 26307 2022-03-16 07:55:40,635.635 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-16 07:55:40,635.635 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.71939086914062 2022-03-16 07:55:40,635.635 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.5934708731515 2022-03-16 07:55:45,538.538 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01597350835800171 2022-03-16 07:55:45,538.538 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:55:45,538.538 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'people', 'in', 'a', '[MASK]', 'preparing', '[MASK]', 'near', 'an', 'oven', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:55:45,554.554 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'hair', 'man', 'woman', 'wall', 'floor', 'sign', 'person', 'glasses', 'ceiling', 'cart', 'oven', 'head', 'kitchen', 'apron', 'hand', 'tray', 'food', 'door', 'tile', 'shelf', 'handle', 'container', 'light', 'face', 'vent', 'watch', 'arm', 'lady', 'rack', 'box', 'machine', 'sunglasses', 'pan', 'wheel', 'ear', 'shoe', 'bag', 'chef', 'pole', 'hat', 'grill', 'jean', 'plate', 'bowl', 'stove', 'bin', 'table', 'window'] 2022-03-16 07:56:01,532.532 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'door', 'woman', 'hair', 'person', 'floor', 'table', 'wall', 'food', 'box', 'sign', 'machine', 'shirt', 'kitchen', 'handle', 'bottle', 'ceiling', 'glasses', 'shoe', 'cart', 'shelf', 'tray', 'rack', 'oven', 'apron'] 2022-03-16 07:58:25,238.238 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:08:46 iter: 7000 speed: 310.4 images/sec total_norm: 126.6466 (131.0433) loss: 161.2748 (160.4148) masked_loss: 1.8556 (1.9263) tag_loss: 158.1733 (158.4885) time: 1.4348 (1.6496) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4298 (1.6446) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 07:58:25,600.600 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 07:58:25,600.600 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.92257690429688 2022-03-16 07:58:25,600.600 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.57982828919317 2022-03-16 07:58:30,510.510 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016118109226226807 2022-03-16 07:58:30,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 07:58:30,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pizza', 'on', 'a', 'table', '[MASK]', 'a', 'bowl', 'of', 'grapes', '[MASK]', 'drinks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 07:58:30,526.526 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'plate', 'glass', 'pizza', 'olive', 'fork', '[UNK]', 'candle', 'water', 'napkin', 'food', 'onion', 'bowl', 'cheese', 'cup', 'grape', 'beer', 'knife', 'paper', 'liquid', 'restaurant', 'pea', 'crust', 'ham', 'menu', 'person', 'hand', 'straw', 'pepper', 'drink', 'coaster', 'bottle', 'handle', 'meat', 'receipt', 'leaf', 'salt', 'spoon', 'slice', 'light', 'white', 'wine', 'shirt', 'chair', 'tomato', 'vegetable', 'bread', 'butter', 'dish', 'logo'] 2022-03-16 07:58:46,540.540 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'cup', 'table', 'glass', 'bowl', 'plate', 'bottle', 'leaf', 'fork', 'olive', 'pizza', 'candle', 'grape', 'beverage', 'napkin', 'onion'] 2022-03-16 08:01:10,165.165 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:06:05 iter: 7100 speed: 310.4 images/sec total_norm: 124.5199 (126.7206) loss: 159.0376 (159.8691) masked_loss: 1.9311 (1.9475) tag_loss: 157.2568 (157.9216) time: 1.4348 (1.6492) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4296 (1.6442) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:01:10,528.528 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 08:01:10,528.528 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.12545776367188 2022-03-16 08:01:10,528.528 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.59850809309218 2022-03-16 08:01:15,496.496 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01621018908917904 2022-03-16 08:01:15,496.496 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:01:15,497.497 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'baseball', '[MASK]', 'about', 'to', 'swing', '[MASK]', 'a', 'baseball', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:01:15,512.512 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'man', '[UNK]', 'shirt', 'sock', 'bat', 'sky', 'baseball', 'belt', 'field', 'shoe', 'hat', 'player', 'leg', 'person', 'grass', 'ball', 'glove', 'head', 'batter', 'hand', 'umpire', 'uniform', 'ground', 'cap', 'park', 'fence', 'arm', 'plate', 'cloud', 'photo', 'shadow', 'helmet', 'black', 'white', 'net', 'building', 'catcher', 'base', 'foot', 'game', 'jersey', 'photograph', 'home', 'couple', 'jacket', 'bench', 'mask', 'ready', 'boot'] 2022-03-16 08:01:31,536.536 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'player', 'field', 'ground', 'person', 'tree', 'baseball', 'ball', 'sky', 'shirt', 'leg', 'grass', 'belt', 'hat', 'cap', 'bat', 'shoe', 'umpire', 'batter', 'sock'] 2022-03-16 08:03:55,159.159 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:03:25 iter: 7200 speed: 310.3 images/sec total_norm: 127.4923 (132.1943) loss: 163.9954 (162.7163) masked_loss: 1.9071 (1.9623) tag_loss: 161.7013 (160.7541) time: 1.4345 (1.6500) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4295 (1.6450) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:03:55,521.521 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 08:03:55,522.522 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.30935668945312 2022-03-16 08:03:55,522.522 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.64038702559797 2022-03-16 08:04:00,523.523 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016358964145183563 2022-03-16 08:04:00,523.523 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:04:00,523.523 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', 'flying', '[MASK]', 'the', 'city', 'of', 'paris', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:04:00,539.539 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'building', 'tail', 'airplane', 'tree', 'city', 'wing', 'window', 'cloud', 'engine', 'logo', 'cockpit', 'nose', 'water', 'wheel', 'door', 'airport', '[UNK]', 'roof', 'tower', 'plane', 'large', 'stripe', 'car', 'letter', 'fuselage', 'air', 'landing', 'bridge', 'road', 'mountain', 'bush', 'house', 'fence', 'background', 'crane', 'grass', 'light', 'jet', 'sign', 'gear', 'horizon', 'skyscraper', 'person', 'front', 'wall', 'top', 'pole', 'flower', 'runway'] 2022-03-16 08:04:16,518.518 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'building', 'door', 'base', 'hill', 'mountain', 'engine', 'airport', 'distance', 'window', 'wing', 'tree', 'tower', 'sky', 'roof', 'nose', 'tail', 'cloud', 'logo', 'horizon', 'airplane', 'cockpit', 'spire'] 2022-03-16 08:06:40,227.227 2829:trainer.py:487 do_train_dict(): eta: 1 day, 3:00:45 iter: 7300 speed: 310.2 images/sec total_norm: 126.1608 (127.5308) loss: 158.6285 (160.2556) masked_loss: 1.9067 (1.9713) tag_loss: 156.8830 (158.2843) time: 1.4351 (1.6507) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4300 (1.6456) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:06:40,589.589 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 08:06:40,589.589 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.15765380859375 2022-03-16 08:06:40,590.590 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.67391998703415 2022-03-16 08:06:45,646.646 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016344014555215836 2022-03-16 08:06:45,647.647 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:06:45,647.647 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', '##landa', 'on', 'the', 'hood', 'of', 'a', 'small', 'car', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:06:45,662.662 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'cat', 'bush', 'windshield', '[UNK]', 'window', 'grass', 'tree', 'mirror', 'hood', 'building', 'plant', 'road', 'reflection', 'ear', 'sky', 'door', 'ground', 'curb', 'pole', 'light', 'head', 'tire', 'tail', 'photo', 'wheel', 'fence', 'roof', 'house', 'white', 'paw', 'sidewalk', 'handle', 'leaf', 'wall', 'license', 'shadow', 'weed', 'trunk', 'black', 'line', 'bumper', 'truck', 'sign', 'logo', 'dog', 'flower', 'leg', 'animal', 'street'] 2022-03-16 08:07:01,748.748 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'building', 'light', 'car', 'wall', 'plant', 'tree', 'clothes', 'cat', 'plate', 'bush', 'license', 'photo', 'flower', 'leaf', 'hood', 'logo', 'cloth', 'fence', 'sidewalk', 'windshield'] 2022-03-16 08:09:25,349.349 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:58:05 iter: 7400 speed: 310.1 images/sec total_norm: 127.8721 (131.9460) loss: 162.8023 (162.5743) masked_loss: 1.9534 (1.9888) tag_loss: 160.7738 (160.5855) time: 1.4335 (1.6512) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.6461) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:09:25,710.710 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 08:09:25,710.710 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.70350646972656 2022-03-16 08:09:25,710.710 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.70002421061199 2022-03-16 08:09:30,979.979 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016315054148435593 2022-03-16 08:09:30,979.979 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:09:30,980.980 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'women', '[MASK]', 'in', 'a', 'room', 'holding', 'wine', 'glasses', ',', '[MASK]', 'other', 'people', 'behind', 'them', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:09:30,995.995 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'glass', 'hair', 'woman', 'man', 'person', 'hand', 'wine', 'wall', 'scarf', 'glasses', 'paper', 'ceiling', 'face', 'light', 'window', 'head', 'room', 'jacket', '[UNK]', 'jean', 'watch', 'group', 'napkin', 'purse', 'book', 'table', 'bag', 'sweater', 'pillar', 'arm', 'picture', 'bottle', 'column', 'floor', 'lady', 'neck', 'ring', 'menu', 'door', 'strap', 'folder', 'necklace', 'staircase', 'ear', 'beard', 'smile', 'short', 'chair', 'nose'] 2022-03-16 08:09:47,003.003 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'face', 'room', 'light', 'woman', 'hair', 'girl', 'person', 'floor', 'table', 'wall', 'phone', 'glass', 'paper', 'window', 'cell', 'ring', 'shirt', 'wine', 'bag', 'coat', 'ceiling', 'jacket', 'glasses', 'necklace', 'poster', 'candle', 'scarf', 'folder'] 2022-03-16 08:12:10,561.561 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:55:26 iter: 7500 speed: 309.9 images/sec total_norm: 124.2984 (126.4991) loss: 162.5713 (162.9374) masked_loss: 1.8916 (1.8848) tag_loss: 160.5367 (161.0526) time: 1.4337 (1.6521) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.6470) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:12:10,922.922 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 08:12:10,922.922 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.9164581298828 2022-03-16 08:12:10,922.922 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.75977787218596 2022-03-16 08:12:16,090.090 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0163092240691185 2022-03-16 08:12:16,090.090 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:12:16,090.090 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'dog', 'that', 'is', 'looking', 'at', 'protected', 'herd', 'of', 'sheep', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:12:16,105.105 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sheep', 'head', 'grass', 'hill', 'field', 'ear', 'rock', 'herd', 'face', 'leg', 'tail', 'group', '[UNK]', 'mountain', 'hillside', 'ground', 'sky', 'lamb', 'animal', 'goat', 'wool', 'tree', 'bush', 'eye', 'green', 'nose', 'grassy', 'tag', 'horn', 'fence', 'large', 'other', 'grazing', 'water', 'flock', 'person', 'road', 'bunch', 'open', 'rocky', 'snow', 'white', 'number', 'next', 'pasture', 'dirt', 'gravel', 'landscape', 'background', 'dog'] 03-16 08:12:26.260 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 08:12:26.260 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 08:12:26.925 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 10}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 10}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 9}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}] 2022-03-16 08:12:32,117.117 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'field', 'ground', 'rock', 'hill', 'dog', 'ear', 'grass', 'tail', 'sheep', 'herd'] 2022-03-16 08:14:55,723.723 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:52:47 iter: 7600 speed: 310.0 images/sec total_norm: 131.4668 (133.6689) loss: 156.9714 (157.9888) masked_loss: 1.8445 (1.9087) tag_loss: 154.9431 (156.0801) time: 1.4343 (1.6516) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4292 (1.6467) save_time: 73.3883 (73.3883) lr: 0.000089 max mem: 26307 2022-03-16 08:14:56,084.084 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 08:14:56,085.085 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.597412109375 2022-03-16 08:14:56,085.085 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.76016156085126 2022-03-16 08:15:01,272.272 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016325822100043297 2022-03-16 08:15:01,272.272 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:15:01,273.273 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'holding', 'a', 'fr', '##is', '##bee', '[MASK]', 'to', 'a', 'boy', 'in', 'front', 'of', 'him', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:15:01,288.288 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'short', 'sky', 'tree', 'grass', 'head', 'hair', 'logo', 'field', 'hat', '[UNK]', 'hand', 'fence', 'face', 'cap', 'sunglasses', 'leg', 'stripe', 'arm', 'ear', 'line', 'glove', 'watch', 'glasses', 'sock', 'park', 'pole', 'shoe', 'wire', 'bush', 'boy', 'cloud', 'jersey', 'post', 'house', 'game', 'person', 'uniform', 'ground', 'cone', 'design', 'young', 'grassy', 'nose', 'knee', 'number', 'band', 'background', 'sleeve'] 2022-03-16 08:15:17,254.254 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'power', 'front', 'short', 'field', 'hair', 'post', 'person', 'arm', 'boy', 'tree', 'watch', 'sky', 'shirt', 'leg', 'nose', 'ear', 'grass', 'hat', 'cap', 'wrist', 'logo', 'fence', 'glove', 'stripe'] 2022-03-16 08:17:40,798.798 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:50:07 iter: 7700 speed: 310.2 images/sec total_norm: 126.8387 (131.1606) loss: 160.7193 (161.1674) masked_loss: 2.0412 (1.9750) tag_loss: 158.6781 (159.1924) time: 1.4340 (1.6508) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6457) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:17:41,159.159 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 08:17:41,160.160 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.6061248779297 2022-03-16 08:17:41,160.160 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.83692081157977 2022-03-16 08:17:46,362.362 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016308940947055817 2022-03-16 08:17:46,362.362 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:17:46,362.362 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', '[MASK]', 'on', 'a', 'skate', '##board', 'doing', 'tricks', 'on', 'the', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:17:46,378.378 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'grass', 'hand', 'window', 'man', 'arm', 'head', '[UNK]', 'shirt', 'sky', 'tree', 'sidewalk', 'hat', 'hair', 'light', 'person', 'ground', 'road', 'pole', 'boy', 'jean', 'sweater', 'car', 'jacket', 'street', 'house', 'shoe', 'roof', 'sign', 'leg', 'shadow', 'wall', 'park', 'face', 'bush', 'sweatshirt', 'cap', 'door', 'rock', 'city', 'curb', 'line', 'fence', 'coat', 'cloud', 'wheel', 'woman', 'glove', 'sleeve', 'balcony'] 2022-03-16 08:18:02,365.365 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'park', 'young', 'light', 'ground', 'rock', 'arm', 'boy', 'window', 'tree', 'sky', 'jean', 'leg', 'bell', 'shadow', 'wheel', 'grass', 'hat', 'pole', 'lamp', 'shoe', 'cement', 'sidewalk', 'boulder', 'sweater'] 2022-03-16 08:20:26,093.093 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:47:28 iter: 7800 speed: 309.8 images/sec total_norm: 126.0564 (129.4153) loss: 162.3647 (163.3301) masked_loss: 1.9442 (1.9351) tag_loss: 160.4022 (161.3950) time: 1.4337 (1.6529) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4284 (1.6475) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:20:26,455.455 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 08:20:26,456.456 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.39051818847656 2022-03-16 08:20:26,456.456 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.87405501739889 2022-03-16 08:20:31,738.738 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016296187415719032 2022-03-16 08:20:31,739.739 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:20:31,739.739 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'cats', 'that', '[MASK]', 'sitting', 'on', '[MASK]', 'fence', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:20:31,754.754 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'tail', 'head', 'bowl', 'grass', 'fence', 'leg', 'paw', 'wood', 'body', 'wall', 'plant', 'eye', 'leaf', 'post', '[UNK]', 'bench', 'tree', 'board', 'wooden', 'ledge', 'fur', 'neck', 'back', 'top', 'table', 'light', 'pot', 'window', 'flower', 'black', 'trunk', 'nose', 'sun', 'bush', 'collar', 'brown', 'water', 'large', 'small', 'branch', 'white', 'animal', 'container', 'weed', 'face', 'field', 'bucket', 'next'] 2022-03-16 08:20:47,827.827 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'post', 'wall', 'couple', 'nose', 'ear', 'bowl', 'cat', 'grass', 'tail', 'flower', 'fence', 'ledge', 'paw'] 2022-03-16 08:23:11,304.304 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:44:48 iter: 7900 speed: 309.9 images/sec total_norm: 123.3705 (125.8484) loss: 159.1401 (160.0956) masked_loss: 1.9052 (1.9651) tag_loss: 157.3639 (158.1305) time: 1.4332 (1.6522) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4280 (1.6472) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:23:11,665.665 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 08:23:11,665.665 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 176.79196166992188 2022-03-16 08:23:11,665.665 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.89437084197998 2022-03-16 08:23:16,964.964 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01640871912240982 2022-03-16 08:23:16,964.964 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:23:16,965.965 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', '[MASK]', 'an', 'umbrella', 'over', 'another', 'man', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:23:16,980.980 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'man', 'hand', 'head', 'building', 'hair', 'face', 'nose', 'arm', 'wall', 'ear', 'shirt', '[UNK]', 'mouth', 'eye', 'jacket', 'phone', 'glasses', 'person', 'window', 'sleeve', 'bush', 'cell', 'collar', 'woman', 'hat', 'beard', 'pole', 'door', 'house', 'table', 'finger', 'watch', 'brick', 'fence', 'bench', 'sweater', 'cap', 'sunglasses', 'sign', 'jean', 'handle', 'logo', 'camera', 'trunk', 'chair', 'boy', 'roof', 'neck', 'grass'] 2022-03-16 08:23:32,994.994 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'woman', 'hair', 'person', 'arm', 'neck', 'foot', 'tree', 'shirt', 'nose', 'suit', 'coat', 'hat', 'button', 'jacket', 'glasses', 'umbrella'] 2022-03-16 08:25:56,533.533 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:42:08 iter: 8000 speed: 309.9 images/sec total_norm: 128.0208 (127.7262) loss: 162.2748 (163.2440) masked_loss: 1.9377 (1.9987) tag_loss: 160.7706 (161.2453) time: 1.4342 (1.6523) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.6471) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:25:56,894.894 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 08:25:56,894.894 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.53936767578125 2022-03-16 08:25:56,895.895 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.90647916440611 2022-03-16 08:26:02,244.244 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0164373479783535 2022-03-16 08:26:02,244.244 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:26:02,245.245 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'team', 'of', 'baseball', 'players', 'playing', 'a', 'game', 'of', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:26:02,260.260 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['field', '[UNK]', 'stand', 'player', 'shirt', 'man', 'helmet', 'umpire', 'person', 'catcher', 'uniform', 'bat', 'batter', 'line', 'crowd', 'grass', 'shoe', 'baseball', 'wall', 'stadium', 'fence', 'chair', 'jersey', 'plate', 'dirt', 'hat', 'home', 'sign', 'game', 'head', 'glove', 'pitcher', 'spectator', 'number', 'cap', 'net', 'hedge', 'mound', 'leg', 'stair', 'belt', 'mask', 'ball', 'hand', 'base', 'pitchers', 'group', 'pole', 'back', 'sock'] 2022-03-16 08:26:18,261.261 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'team', 'man', 'home', 'game', 'line', 'player', 'field', 'person', 'wall', 'base', 'stand', 'stadium', 'baseball', 'sign', 'shirt', 'jersey', 'leg', 'crowd', 'plate', 'grass', 'hat', 'uniform', 'dirt', 'bat', 'logo', 'fence', 'helmet', 'shoe', 'catcher', 'glove', 'hedge', 'umpire', 'spectator', 'batter'] 2022-03-16 08:28:41,675.675 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:39:28 iter: 8100 speed: 310.0 images/sec total_norm: 130.1164 (130.8522) loss: 162.8019 (160.7357) masked_loss: 1.8161 (1.8873) tag_loss: 160.8265 (158.8485) time: 1.4323 (1.6514) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.6463) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:28:42,039.039 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-16 08:28:42,040.040 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.68116760253906 2022-03-16 08:28:42,040.040 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.92798046949433 2022-03-16 08:28:47,429.429 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01645222119987011 2022-03-16 08:28:47,429.429 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:28:47,429.429 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'before', '[MASK]', '##nished', '[MASK]', 'panels', 'is', 'a', 'young', 'man', 'in', 'a', 'gray', 'striped', 'dress', 'shirt', ',', '[MASK]', 'tie', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:28:47,445.445 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'wall', 'hair', 'head', '[UNK]', 'face', 'nose', 'hand', 'ear', 'door', 'eye', 'tie', 'mouth', 'arm', 'collar', 'glasses', 'cabinet', 'belt', 'floor', 'suit', 'picture', 'room', 'chair', 'knob', 'table', 'woman', 'ceiling', 'shelf', 'bottle', 'light', 'window', 'handle', 'beard', 'person', 'watch', 'jean', 'jacket', 'glass', 'phone', 'neck', 'paper', 'mustache', 'button', 'cup', 'box', 'mirror', 'sign', 'sleeve', 'leg'] 2022-03-16 08:29:03,462.462 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'black', 'door', 'young', 'dark', 'hair', 'mouth', 'wall', 'eye', 'wood', 'shirt', 'gray', 'dress', 'nose', 'ear', 'handle', 'tie', 'belt', 'knob'] 2022-03-16 08:31:26,838.838 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:36:48 iter: 8200 speed: 310.0 images/sec total_norm: 129.3882 (130.4708) loss: 158.9577 (161.8745) masked_loss: 1.9345 (1.9505) tag_loss: 156.9890 (159.9239) time: 1.4317 (1.6517) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4269 (1.6467) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:31:27,201.201 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 08:31:27,202.202 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.48638916015625 2022-03-16 08:31:27,202.202 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.98289324289345 2022-03-16 08:31:32,588.588 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016465699300169945 2022-03-16 08:31:32,589.589 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:31:32,589.589 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'taking', 'a', 'picture', 'of', 'himself', 'with', 'something', 'in', 'his', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:31:32,604.604 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'door', 'hair', 'man', 'hand', '[UNK]', 'glasses', 'arm', 'phone', 'face', 'head', 'window', 'wall', 'cell', 'ear', 'pillar', 'room', 'tree', 'ceiling', 'handle', 'camera', 'column', 'design', 'floor', 'knob', 'jean', 'bush', 'picture', 'nose', 'light', 'sleeve', 'short', 'elbow', 'blind', 'leg', 'house', 'logo', 'screen', 'cabinet', 'switch', 'tile', 'chair', 'plant', 'eye', 'outside', 'young', 'post', 'doorway', 'mouth', 'person'] 2022-03-16 08:31:48,637.637 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'something', 'door', 'short', 'hair', 'mouth', 'wall', 'arm', 'phone', 'window', 'image', 'shirt', 'picture', 'ear', 'handle', 'bush', 'glasses', 'elbow', 'closet', 'pillow', 'bicycle', 'towel', 'straw', 'pillar', 'knob'] 2022-03-16 08:34:12,189.189 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:34:09 iter: 8300 speed: 309.6 images/sec total_norm: 128.5603 (131.6583) loss: 157.4153 (158.8804) masked_loss: 1.8459 (1.8920) tag_loss: 155.8786 (156.9883) time: 1.4335 (1.6535) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4284 (1.6484) save_time: 73.3883 (73.3883) lr: 0.000088 max mem: 26307 2022-03-16 08:34:12,552.552 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 08:34:12,552.552 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 182.23329162597656 2022-03-16 08:34:12,552.552 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.98092814854213 2022-03-16 08:34:18,007.007 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016450172290205956 2022-03-16 08:34:18,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:34:18,008.008 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'many', 'different', 'clocks', 'and', 'different', 'time', 'zones', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:34:18,023.023 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'hand', 'number', 'letter', 'wall', 'word', 'face', 'shadow', '[UNK]', 'reflection', 'writing', 'ceiling', 'light', 'floor', 'display', 'handle', 'sign', 'logo', 'base', 'stand', 'man', 'window', 'person', 'white', 'tile', 'table', 'mirror', 'top', 'paper', 'circle', 'box', 'shirt', 'line', 'door', 'head', 'car', 'cord', 'large', 'frame', 'woman', 'arrow', 'arm', 'lettering', 'room', 'platform', 'design', 'hair', 'leg', 'pole', 'name'] 2022-03-16 08:34:34,076.076 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['time', 'many', 'name', 'hand', 'number', 'face', 'different', 'word', 'wall', 'letter', 'clock'] 2022-03-16 08:36:57,548.548 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:31:29 iter: 8400 speed: 309.6 images/sec total_norm: 129.6379 (134.4788) loss: 157.8736 (160.3304) masked_loss: 1.8816 (1.9116) tag_loss: 155.9777 (158.4188) time: 1.4333 (1.6536) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6485) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307 2022-03-16 08:36:57,912.912 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 08:36:57,912.912 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.57479858398438 2022-03-16 08:36:57,912.912 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.94503371294807 2022-03-16 08:37:03,405.405 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01646980084478855 2022-03-16 08:37:03,405.405 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 08:37:03,405.405 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'young', 'men', 'playing', 'a', 'game', 'of', 'soccer', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 08:37:03,421.421 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'tree', 'short', 'hair', 'sock', 'shoe', 'ball', 'ground', 'grass', 'hand', 'head', '[UNK]', 'tank', 'pole', 'leg', 'vest', 'trunk', 'arm', 'park', 'soccer', 'sidewalk', 'person', 'top', 'stripe', 'leaf', 'boy', 'face', 'jean', 'fence', 'bottle', 'field', 'bag', 'guy', 'background', 'bench', 'jacket', 'car', 'foot', 'rock', 'phone', 'hat', 'dirt', 'couple', 'trash', 'tire', 'ear', 'beard', 'male', 'cup'] 2022-03-16 08:37:19,392.392 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'group', 'hand', 'game', 'top', 'young', 'cup', 'short', 'ground', 'hair', 'person', 'arm', 'tree', 'ball', 'jean', 'shirt', 'leg', 'soccer', 'tank', 'grass', 'pole', 'helmet', 'shoe', 'vest', 'sock'] 2022-03-16 08:39:43,055.055 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:28:51 iter: 8500 speed: 309.4 images/sec total_norm: 127.0314 (128.9566) loss: 159.1973 (158.7116) masked_loss: 1.8547 (1.8785) tag_loss: 157.5667 (156.8330) time: 1.4340 (1.6551) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6499) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307 2022-03-16 08:39:43,417.417 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 08:39:43,418.418 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.6230010986328 2022-03-16 08:39:43,418.418 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 68.9906919612441
2022-03-16 08:39:48,967.967 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016516124829649925
2022-03-16 08:39:48,967.967 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:39:48,968.968 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'multiple', 'clouds', 'in', 'a', 'field', 'on', 'a', 'cloudy', 'day', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:39:48,983.983 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'cow', 'tree', 'head', 'sky', 'shadow', 'leg', 'ear', 'field', 'face', 'trunk', 'nose', 'tail', 'leaf', 'cloud', 'horn', '[UNK]', 'green', 'animal', 'group', 'water', 'cattle', 'herd', 'pasture', 'ground', 'grassy', 'bull', 'spot', 'post', 'rock', 'sheep', 'eye', 'plant', 'background', 'stick', 'fence', 'calf', 'mouth', 'brown', 'flower', 'lush', 'bush', 'tag', 'grazing', 'white', 'horizon', 'hill', 'distance', 'pole', 'dog']
2022-03-16 08:40:04,993.993 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'day', 'face', 'field', 'mouth', 'eye', 'tree', 'sky', 'leg', 'nose', 'ear', 'shadow', 'grass', 'tail', 'cloud', 'horn', 'trunk', 'cow', 'cloudy']
03-16 08:42:27.025 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 08:42:27.025 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 08:42:28.277 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}]
2022-03-16 08:42:28,526.526 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:26:12 iter: 8600 speed: 309.4 images/sec total_norm: 126.1418 (129.9278) loss: 156.5067 (159.4967) masked_loss: 1.9182 (1.9152) tag_loss: 154.5516 (157.5815) time: 1.4339 (1.6546) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4288 (1.6494) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:42:28,887.887 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 08:42:28,888.888 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.04002380371094
2022-03-16 08:42:28,888.888 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.07255326194324
2022-03-16 08:42:34,486.486 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016567859798669815
2022-03-16 08:42:34,487.487 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:42:34,487.487 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'cat', '[MASK]', 'to', '[MASK]', 'photograph', 'of', 'a', 'cat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:42:34,502.502 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'cat', 'wall', 'head', 'frame', 'picture', 'eye', 'tail', 'carpet', '[UNK]', 'mirror', 'black', 'floor', 'face', 'nose', 'room', 'table', 'man', 'reflection', 'photo', 'leg', 'shadow', 'window', 'paw', 'door', 'white', 'light', 'body', 'fur', 'photograph', 'painting', 'curtain', 'front', 'neck', 'next', 'woman', 'couple', 'animal', 'dog', 'book', 'ceiling', 'shelf', 'collar', 'rug', 'ground', 'mouth', 'flower', 'handle', 'person', 'top']
2022-03-16 08:42:50,487.487 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'room', 'black', 'floor', 'wall', 'eye', 'picture', 'dog', 'animal', 'ear', 'frame', 'cat', 'shadow', 'tail', 'photograph', 'carpet']
2022-03-16 08:45:14,105.105 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:23:34 iter: 8700 speed: 309.2 images/sec total_norm: 127.0476 (128.9059) loss: 158.8764 (161.1638) masked_loss: 1.7965 (1.8851) tag_loss: 157.1656 (159.2787) time: 1.4347 (1.6559) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4295 (1.6507) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:45:14,466.466 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 08:45:14,466.466 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 191.82302856445312
2022-03-16 08:45:14,466.466 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.0547200983221
2022-03-16 08:45:20,049.049 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016604948788881302
2022-03-16 08:45:20,049.049 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:45:20,049.049 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'a', '[MASK]', '##d', 'in', 'enclosure', 'eats', 'from', 'the', 'ground', 'beneath', 'a', 'tree', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:45:20,065.065 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'zebra', 'leg', 'grass', 'tail', 'pole', 'shadow', 'tree', 'head', 'fence', 'stripe', 'trunk', 'dirt', '[UNK]', 'mane', 'ear', 'hay', 'post', 'neck', 'wire', 'rock', 'enclosure', 'nose', 'zoo', 'bush', 'stick', 'pen', 'feeder', 'branch', 'straw', 'log', 'mouth', 'leaf', 'mesh', 'rope', 'area', 'next', 'stump', 'building', 'animal', 'road', 'shade', 'wall', 'hair', 'basket', 'trough', 'other', 'field', 'foot', 'hose']
2022-03-16 08:45:36,052.052 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'ground', 'post', 'neck', 'tree', 'wood', 'branch', 'leg', 'shadow', 'grass', 'tail', 'dirt', 'wire', 'rope', 'trunk', 'fence', 'log', 'enclosure', 'stripe', 'mesh', 'mane', 'netting', 'zebra']
2022-03-16 08:47:59,615.615 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:20:55 iter: 8800 speed: 309.3 images/sec total_norm: 128.2200 (128.9056) loss: 161.5915 (159.8988) masked_loss: 1.8907 (1.9220) tag_loss: 159.7148 (157.9769) time: 1.4329 (1.6551) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4281 (1.6501) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:47:59,977.977 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-16 08:47:59,977.977 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.46978759765625
2022-03-16 08:47:59,977.977 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.0237956529253
2022-03-16 08:48:05,633.633 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01659729890525341
2022-03-16 08:48:05,633.633 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:48:05,634.634 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'person', '[MASK]', 'a', 'motorcycle', 'coming', 'up', 'to', 'a', 'stop', 'sign', '[MASK]', 'the', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:48:05,649.649 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'road', 'tire', 'pole', 'tree', 'car', 'street', 'motorcycle', 'light', 'line', 'man', '[UNK]', 'helmet', 'wire', 'building', 'person', 'traffic', 'sidewalk', 'sign', 'curb', 'window', 'bike', 'wall', 'jacket', 'wheel', 'shadow', 'bush', 'lot', 'intersection', 'power', 'shirt', 'roof', 'jean', 'truck', 'house', 'suv', 'windshield', 'van', 'mirror', 'parking', 'fence', 'license', 'plate', 'grass', 'hat', 'tail', 'arrow', 'bus', 'stop', 'median']
2022-03-16 08:48:21,668.668 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'road', 'power', 'street', 'light', 'car', 'stop', 'person', 'tree', 'sign', 'sky', 'truck', 'shadow', 'grass', 'bush', 'pole', 'jacket', 'motorcycle', 'tire']
2022-03-16 08:50:45,198.198 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:18:17 iter: 8900 speed: 309.2 images/sec total_norm: 129.5278 (134.4210) loss: 159.9204 (161.6548) masked_loss: 1.8063 (1.8724) tag_loss: 157.4885 (159.7824) time: 1.4322 (1.6559) data: 0.0001 (0.0005) to_device: 0.0049 (0.0047) time_gpu: 1.4270 (1.6506) save_time: 73.3883 (73.3883) lr: 0.000087 max mem: 26307
2022-03-16 08:50:45,558.558 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 08:50:45,558.558 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.56634521484375
2022-03-16 08:50:45,558.558 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.04009874131944
2022-03-16 08:50:51,254.254 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016634687781333923
2022-03-16 08:50:51,254.254 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:50:51,254.254 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'elephant', 'walking', 'in', 'the', '[MASK]', 'alone', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:50:51,270.270 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'ear', 'trunk', 'elephant', 'ground', 'head', 'leg', 'eye', 'pole', 'foot', 'fence', 'tail', 'sand', 'mouth', 'dirt', 'back', '[UNK]', 'wall', 'building', 'enclosure', 'rock', 'zoo', 'person', 'shadow', 'road', 'man', 'background', 'bush', 'water', 'face', 'structure', 'shirt', 'couple', 'sky', 'hair', 'post', 'grass', 'leaf', 'baby', 'roof', 'stick', 'woman', 'forest', 'light', 'toe', 'sign', 'hill', 'short', 'next', 'pen']
2022-03-16 08:51:07,241.241 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'large', 'ground', 'eye', 'foot', 'tree', 'walking', 'leg', 'ear', 'sand', 'grass', 'pole', 'dirt', 'trunk', 'fence', 'elephant']
2022-03-16 08:53:30,811.811 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:15:38 iter: 9000 speed: 309.2 images/sec total_norm: 126.0652 (130.1862) loss: 159.0853 (160.9368) masked_loss: 1.8702 (1.9007) tag_loss: 157.0059 (159.0362) time: 1.4334 (1.6561) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4284 (1.6509) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 08:53:31,171.171 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647
2022-03-16 08:53:31,172.172 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.279541015625
2022-03-16 08:53:31,172.172 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.02513851962247
2022-03-16 08:53:36,961.961 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01662563905119896
2022-03-16 08:53:36,961.961 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:53:36,962.962 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', 'getting', 'his', 'hair', '[MASK]', 'by', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:53:36,977.977 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'head', 'shirt', 'man', 'hair', 'face', 'nose', '[UNK]', 'ear', 'arm', 'person', 'wall', 'hat', 'eye', 'knife', 'building', 'woman', 'window', 'mouth', 'handle', 'sign', 'table', 'plate', 'bracelet', 'collar', 'food', 'door', 'pole', 'glasses', 'bag', 'container', 'watch', 'mustache', 'scissors', 'dress', 'ground', 'boy', 'cap', 'tree', 'chair', 'fork', 'bowl', 'sky', 'jacket', 'button', 'ceiling', 'bus', 'paper', 'pan', 'umbrella']
2022-03-16 08:53:52,926.926 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'light', 'hair', 'person', 'table', 'wall', 'cut', 'chair', 'window', 'box', 'sign', 'shirt', 'ear', 'bowl', 'clock', 'mirror', 'knife', 'paint', 'cloth', 'pipe', 'beard', 'towel', 'robe', 'poster', 'barber', 'apron']
2022-03-16 08:56:16,340.340 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:12:59 iter: 9100 speed: 309.3 images/sec total_norm: 130.0455 (136.3868) loss: 159.1187 (160.4913) masked_loss: 1.8685 (1.8612) tag_loss: 157.7375 (158.6301) time: 1.4323 (1.6553) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.6501) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 08:56:16,703.703 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 08:56:16,703.703 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.28884887695312
2022-03-16 08:56:16,703.703 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.05082578244416
2022-03-16 08:56:22,474.474 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01661892607808113
2022-03-16 08:56:22,474.474 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:56:22,474.474 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'array', '[MASK]', 'past', '##ries', 'next', '[MASK]', 'five', 'boxes', 'lined', 'up', 'next', 'to', '[MASK]', 'other', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:56:22,489.489 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'box', 'food', 'sign', 'table', 'plate', 'pastry', 'meat', 'hole', 'different', 'wall', 'design', 'potato', 'writing', 'pile', 'bread', 'paper', 'sandwich', 'book', 'top', 'variety', 'chicken', 'various', 'other', 'next', 'container', 'dessert', 'bag', 'mushroom', 'light', 'letter', 'bunch', 'sugar', 'large', 'menu', 'cookie', 'french', 'label', 'close', 'picture', 'full', 'hamburger', 'white', 'store', 'many', 'dog', 'tray', 'logo', 'napkin', 'shelf']
2022-03-16 08:56:38,501.501 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'next', 'food', 'box', 'plate', 'chicken', 'array', 'pastry']
2022-03-16 08:59:01,970.970 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:10:20 iter: 9200 speed: 309.1 images/sec total_norm: 125.5824 (130.6383) loss: 157.1022 (158.2484) masked_loss: 1.9556 (1.9489) tag_loss: 154.9592 (156.2994) time: 1.4329 (1.6563) data: 0.0002 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4278 (1.6514) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 08:59:02,331.331 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 08:59:02,331.331 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.73724365234375
2022-03-16 08:59:02,331.331 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.13154889178533
2022-03-16 08:59:08,168.168 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0166626013815403
2022-03-16 08:59:08,169.169 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 08:59:08,169.169 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'rhino', '##s', '[MASK]', 'in', 'a', 'field', 'near', 'a', 'zebra', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 08:59:08,185.185 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'shadow', 'leg', 'rock', 'zebra', 'tree', 'tail', 'zoo', 'wall', 'ear', '[UNK]', 'head', 'trunk', 'enclosure', 'animal', 'log', 'mane', 'boulder', 'dirt', 'nose', 'branch', 'stripe', 'elephant', 'mouth', 'horn', 'sand', 'eye', 'neck', 'stick', 'pen', 'back', 'fence', 'other', 'area', 'wood', 'hole', 'shade', 'next', 'grass', 'habitat', 'group', 'foot', 'face', 'pole', 'baby', 'sky', 'pig', 'water', 'body', 'couple']
2022-03-16 08:59:24,306.306 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'ground', 'rock', 'wall', 'standing', 'couple', 'tree', 'wood', 'branch', 'animal', 'leg', 'shadow', 'tail', 'dirt', 'log', 'zoo', 'enclosure', 'stripe', 'mane', 'zebra']
2022-03-16 09:01:47,806.806 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:07:43 iter: 9300 speed: 308.7 images/sec total_norm: 128.4512 (131.5962) loss: 158.2369 (159.9482) masked_loss: 1.8774 (1.8981) tag_loss: 155.9892 (158.0501) time: 1.4331 (1.6584) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4279 (1.6533) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:01:48,166.166 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 09:01:48,166.166 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.9694366455078
2022-03-16 09:01:48,167.167 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.14512333971389
2022-03-16 09:01:54,041.041 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01666449010372162
2022-03-16 09:01:54,042.042 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:01:54,042.042 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'computer', '[MASK]', 'a', 'keyboard', 'sitting', '[MASK]', 'the', 'desk', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:01:54,058.058 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['monitor', 'computer', 'desk', 'keyboard', 'box', 'screen', 'mouse', 'stand', 'base', 'table', 'wall', 'cord', '[UNK]', 'key', 'speaker', 'logo', 'pad', 'office', 'window', 'button', 'wire', 'paper', 'book', 'lamp', 'phone', 'pen', 'light', 'drawer', 'desktop', 'laptop', 'container', 'bottle', 'cup', 'tower', 'shelf', 'television', 'top', 'handle', 'room', 'picture', 'cabinet', 'white', 'next', 'chair', 'plug', 'board', 'floor', 'cap', 'lid', 'telephone']
2022-03-16 09:02:10,059.059 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'player', 'wall', 'base', 'stand', 'computer', 'box', 'tower', 'screen', 'desk', 'speaker', 'remote', 'mouse', 'monitor', 'logo', 'keyboard', 'cord', 'plug']
2022-03-16 09:04:33,608.608 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:05:05 iter: 9400 speed: 308.8 images/sec total_norm: 125.5806 (129.5521) loss: 159.8548 (160.1774) masked_loss: 1.8836 (1.9076) tag_loss: 157.8070 (158.2698) time: 1.4329 (1.6580) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6529) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:04:33,969.969 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 09:04:33,970.970 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.71566772460938
2022-03-16 09:04:33,970.970 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.15443621183697
2022-03-16 09:04:39,934.934 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016728058457374573
2022-03-16 09:04:39,934.934 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:04:39,934.934 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'bathroom', 'is', 'all', 'white', 'and', 'has', 'no', 'towels', '.', 'indirect', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:04:39,949.949 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mirror', 'wall', 'bathroom', 'sink', '[UNK]', 'shelf', 'curtain', 'toilet', 'cabinet', 'tile', 'window', 'outlet', 'knob', 'handle', 'floor', 'light', 'seat', 'lid', 'pipe', 'white', 'drain', 'door', 'tank', 'reflection', 'shower', 'ceiling', 'rod', 'towel', 'camera', 'bag', 'woman', 'soap', 'bottle', 'person', 'hair', 'head', 'paper', 'holder', 'rack', 'man', 'small', 'ring', 'drawer', 'hole', 'picture', 'can', 'tub', 'dish', 'frame', 'glass']
2022-03-16 09:04:55,933.933 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'woman', 'person', 'wall', 'window', 'frame', 'handle', 'cabinet', 'mirror', 'bathroom', 'sink', 'purse', 'reflection', 'towel', 'lamp', 'curtain', 'shelf', 'outlet', 'tile', 'tub', 'rack']
2022-03-16 09:07:19,474.474 2829:trainer.py:487 do_train_dict(): eta: 1 day, 2:02:27 iter: 9500 speed: 308.7 images/sec total_norm: 126.1663 (127.4436) loss: 155.6365 (157.4723) masked_loss: 1.8522 (1.8648) tag_loss: 153.8805 (155.6075) time: 1.4342 (1.6586) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4295 (1.6536) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:07:19,835.835 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 09:07:19,835.835 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.87757873535156
2022-03-16 09:07:19,835.835 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.18995300928752
2022-03-16 09:07:25,779.779 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016732526943087578
2022-03-16 09:07:25,779.779 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:07:25,780.780 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'riding', 'a', '[MASK]', '##board', 'across', 'a', 'cement', 'ramp', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:07:25,795.795 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'man', '[UNK]', 'jean', 'arm', 'leg', 'shirt', 'ground', 'hand', 'head', 'hat', 'short', 'hair', 'background', 'foot', 'beach', 'sky', 'sand', 'shoe', 'face', 'water', 'board', 'cap', 'shadow', 'boy', 'wheel', 'knee', 'woman', 'logo', 'grass', 'girl', 'ear', 'building', 'eye', 'top', 'ocean', 'mouth', 'wall', 'back', 'elbow', 'tree', 'white', 'road', 'young', 'nose', 'sleeve', 'wave', 'picture', 'line', 'fence']
2022-03-16 09:07:41,813.813 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'ground', 'foot', 'sky', 'jean', 'leg', 'wheel', 'hat', 'knee', 'shoe', 'cement', 'ramp', 'sock']
2022-03-16 09:10:05,252.252 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:59:49 iter: 9600 speed: 308.8 images/sec total_norm: 126.5201 (130.6445) loss: 157.8926 (157.8934) masked_loss: 1.8769 (1.8979) tag_loss: 156.0144 (155.9954) time: 1.4320 (1.6578) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4274 (1.6529) save_time: 73.3883 (73.3883) lr: 0.000086 max mem: 26307
2022-03-16 09:10:05,612.612 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174
2022-03-16 09:10:05,612.612 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.73374938964844
2022-03-16 09:10:05,612.612 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.20913963711139
2022-03-16 09:10:11,634.634 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016805104911327362
2022-03-16 09:10:11,634.634 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:10:11,635.635 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'traffic', 'and', 'street', 'signs', '[MASK]', 'a', 'wooden', 'pole', '##ophone', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:10:11,650.650 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sign', 'letter', 'sky', 'stop', 'pole', 'street', 'road', '[UNK]', 'building', 'post', 'trunk', 'car', 'light', 'window', 'grass', 'bush', 'ground', 'leaf', 'bolt', 'background', 'sidewalk', 'red', 'roof', 'house', 'branch', 'fence', 'intersection', 'line', 'arrow', 'person', 'wall', 'curb', 'front', 'man', 'traffic', 'graffiti', 'shadow', 'word', 'truck', 'shirt', 'next', 'tire', 'top', 'power', 'bracket', 'screw', 'corner', 'wire', 'side']
2022-03-16 09:10:27,627.627 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'street', 'stop', 'tree', 'letter', 'border', 'sign', 'sky', 'wooden', 'pole', 'leaf', 'rope', 'strap']
03-16 09:12:28.301 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 09:12:28.301 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 09:12:29.837 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 96}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 09:12:51,216.216 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:57:11 iter: 9700 speed: 308.5 images/sec total_norm: 127.6420 (129.0932) loss: 157.9305 (158.7188) masked_loss: 1.8871 (1.9309) tag_loss: 156.2153 (156.7880) time: 1.4332 (1.6597) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4280 (1.6546) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:12:51,582.582 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 09:12:51,582.582 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.11776733398438
2022-03-16 09:12:51,582.582 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.2287683292311
2022-03-16 09:12:57,675.675 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01684863306581974
2022-03-16 09:12:57,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:12:57,676.676 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'that', 'is', 'jumping', '[MASK]', 'the', 'air', 'with', 'a', 'skate', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:12:57,692.692 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', '[UNK]', 'pole', 'shirt', 'building', 'man', 'hat', 'cloud', 'wheel', 'boy', 'ground', 'hand', 'shadow', 'shoe', 'person', 'head', 'light', 'street', 'tree', 'cap', 'arm', 'car', 'wire', 'line', 'jean', 'sidewalk', 'ramp', 'road', 'window', 'fence', 'sign', 'grass', 'wall', 'park', 'skate', 'board', 'leg', 'short', 'curb', 'roof', 'air', 'trick', 'hair', 'bush', 'tire', 'foot', 'railing', 'young', 'bench', 'power']
2022-03-16 09:13:13,777.777 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'air', 'building', 'road', 'street', 'car', 'ground', 'arm', 'van', 'sign', 'sky', 'shirt', 'background', 'shadow', 'wheel', 'brick', 'hat', 'cloud', 'pole', 'wire', 'trick', 'barrel', 'shoe', 'sidewalk']
2022-03-16 09:15:37,472.472 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:54:35 iter: 9800 speed: 308.0 images/sec total_norm: 127.4463 (128.1307) loss: 151.4152 (153.5290) masked_loss: 1.7324 (1.8380) tag_loss: 149.8911 (151.6910) time: 1.4351 (1.6625) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4302 (1.6575) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:15:37,833.833 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 09:15:37,833.833 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.14585876464844
2022-03-16 09:15:37,833.833 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.28380638662011
2022-03-16 09:15:43,937.937 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016833890229463577
2022-03-16 09:15:43,937.937 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:15:43,938.938 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'tennis', 'player', '[MASK]', 'swinging', 'at', 'a', 'volley', 'during', 'a', 'match', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:15:43,953.953 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'shoe', 'man', 'short', '[UNK]', 'sock', 'tennis', 'court', 'leg', 'hand', 'wall', 'ground', 'hat', 'ball', 'head', 'hair', 'person', 'player', 'cap', 'boy', 'logo', 'line', 'sign', 'letter', 'fence', 'uniform', 'banner', 'arm', 'shadow', 'dirt', 'game', 'chair', 'stripe', 'woman', 'net', 'outfit', 'window', 'number', 'pole', 'handle', 'tree', 'playing', 'jacket', 'background', 'top', 'match', 'blue', 'group', 'plant', 'stand']
2022-03-16 09:16:00,018.018 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'player', 'court', 'short', 'ground', 'hair', 'match', 'wall', 'arm', 'ball', 'shirt', 'leg', 'tennis', 'shadow', 'hat', 'shoe', 'outfit', 'volley', 'sock']
2022-03-16 09:18:23,347.347 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:51:57 iter: 9900 speed: 308.7 images/sec total_norm: 127.9107 (130.6507) loss: 159.0418 (159.7381) masked_loss: 1.8340 (1.8654) tag_loss: 157.0418 (157.8728) time: 1.4322 (1.6588) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4270 (1.6537) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:18:23,707.707 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 09:18:23,708.708 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.0889892578125
2022-03-16 09:18:23,708.708 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.26748870849609
2022-03-16 09:18:29,858.858 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016970515251159668
2022-03-16 09:18:29,859.859 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:18:29,859.859 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'in', 'the', '[MASK]', 'room', 'playing', 'the', 'nintendo', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:18:29,874.874 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'wall', 'shirt', 'television', 'hand', 'man', 'jean', 'boy', 'game', '[UNK]', 'controller', 'picture', 'floor', 'stand', 'arm', 'remote', 'head', 'ear', 'cord', 'video', 'logo', 'wii', 'ceiling', 'room', 'strap', 'glasses', 'person', 'bracelet', 'screen', 'face', 'door', 'table', 'microphone', 'design', 'leg', 'speaker', 'light', 'girl', 'book', 'carpet', 'tv', 'paper', 'young', 'rug', 'switch', 'woman', 'dvd', 'button', 'short', 'shelf']
2022-03-16 09:18:45,845.845 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'room', 'book', 'player', 'living', 'television', 'hair', 'tv', 'wall', 'arm', 'boy', 'stand', 'chair', 'box', 'jean', 'shirt', 'picture', 'finger', 'dvd', 'logo', 'shelf', 'controller', 'wii']
2022-03-16 09:21:09,372.372 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:49:19 iter: 10000 speed: 308.4 images/sec total_norm: 127.0210 (130.2868) loss: 156.4432 (157.7097) masked_loss: 1.8474 (1.8374) tag_loss: 154.3111 (155.8722) time: 1.4320 (1.6602) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4269 (1.6548) save_time: 73.3883 (73.3883) lr: 0.000085 max mem: 26307
2022-03-16 09:21:09,374.374 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0010000.pt
2022-03-16 09:21:18,627.627 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647
2022-03-16 09:21:18,627.627 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.15859985351562
2022-03-16 09:21:18,627.627 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.28933020865563
2022-03-16 09:21:24,824.824 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.016981514170765877
2022-03-16 09:21:24,824.824 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:21:24,825.825 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', 'standing', 'in', 'front', 'of', 'a', '[MASK]', 'stop', 'sign', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:21:24,839.839 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'ear', 'eye', 'man', 'hair', 'face', 'shirt', 'head', 'mouth', 'collar', 'letter', 'sign', 'pole', 'building', 'wall', 'sidewalk', 'chin', '[UNK]', 'brick', 'front', 'street', 'plant', 'stop', 'pillar', 'door', 'button', 'jacket', 'neck', 'red', 'word', 'post', 'next', 'ground', 'column', 'line', 'background', 'arm', 'window', 'person', 'handle', 'circle', 'leaf', 'white', 'road', 'car', 'lip', 'hand', 'object', 'sleeve', 'curb']
2022-03-16 09:21:40,663.663 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'building', 'front', 'round', 'hair', 'stop', 'mouth', 'wall', 'eye', 'neck', 'letter', 'sign', 'shirt', 'nose', 'ear', 'handle', 'collar']
2022-03-16 09:24:03,637.637 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:47:27 iter: 10100 speed: 293.8 images/sec total_norm: 126.9644 (129.0071) loss: 151.6999 (157.2151) masked_loss: 1.8096 (1.8611) tag_loss: 150.3896 (155.3540) time: 1.4336 (1.7426) data: 0.0001 (0.0002) to_device: 0.0051 (0.0048) time_gpu: 1.4282 (1.6488) save_time: 8.8805 (41.1344) lr: 0.000085 max mem: 26307
2022-03-16 09:24:03,997.997 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 09:24:03,997.997 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.59783935546875
2022-03-16 09:24:03,997.997 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.30462855918735
2022-03-16 09:24:10,207.207 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01699250005185604
2022-03-16 09:24:10,207.207 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:24:10,208.208 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'superiors', '[MASK]', 'has', 'nothing', 'on', 'the', 'counter', 'top', 'except', 'an', 'ipod', 'holder', '/', 'speaker', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:24:10,223.223 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'kitchen', 'light', '[UNK]', 'ceiling', 'floor', 'cabinet', 'stove', 'picture', 'oven', 'hood', 'refrigerator', 'table', 'door', 'vent', 'microwave', 'sink', 'handle', 'room', 'drawer', 'speaker', 'outlet', 'bottle', 'fan', 'top', 'shelf', 'chair', 'vase', 'coffee', 'clock', 'counter', 'window', 'stool', 'stand', 'maker', 'phone', 'large', 'television', 'kettle', 'plate', 'white', 'knob', 'island', 'tile', 'towel', 'bowl', 'bar', 'frame', 'holder', 'box']
2022-03-16 09:24:26,245.245 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'door', 'light', 'nothing', 'floor', 'wall', 'window', 'kitchen', 'picture', 'scale', 'coffee', 'counter', 'handle', 'plate', 'cabinet', 'ceiling', 'maker', 'tray', 'drawer', 'outlet', 'tile', 'stove', 'knob', 'oven', 'microwave', 'vent', 'kettle', 'spacious', 'ipod']
2022-03-16 09:26:49,748.748 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:44:49 iter: 10200 speed: 308.2 images/sec total_norm: 128.0937 (130.6670) loss: 158.9673 (159.5079) masked_loss: 1.8799 (1.9271) tag_loss: 157.1065 (157.5808) time: 1.4333 (1.6611) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4284 (1.6561) save_time: 8.8805 (41.1344) lr: 0.000085 max mem: 26307
2022-03-16 09:26:50,113.113 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282
2022-03-16 09:26:50,113.113 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.9566650390625
2022-03-16 09:26:50,113.113 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.34250263103003
2022-03-16 09:26:56,390.390 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017061792314052582
2022-03-16 09:26:56,391.391 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:26:56,391.391 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'blue', 'two', '[MASK]', '[MASK]', 'on', 'a', 'highway', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:26:56,406.406 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'bus', 'building', 'sign', '[UNK]', 'tire', 'road', 'sky', 'grill', 'fence', 'street', 'plate', 'license', 'wheel', 'double', 'pole', 'sidewalk', 'light', 'windshield', 'line', 'decker', 'person', 'front', 'car', 'wall', 'mirror', 'top', 'man', 'stop', 'letter', 'shirt', 'store', 'deck', 'curb', 'railing', 'roof', 'advertisement', 'door', 'woman', 'city', 'number', 'post', 'tree', 'brick', 'driver', 'gate', 'jacket', 'truck', 'coat', 'box']
2022-03-16 09:27:12,467.467 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'line', 'building', 'top', 'road', 'street', 'light', 'woman', 'story', 'car', 'blue', 'person', 'bridge', 'highway', 'window', 'box', 'letter', 'sign', 'sky', 'shirt', 'bus', 'traffic', 'truck', 'wheel', 'mirror', 'pole', 'fence', 'sidewalk', 'tire', 'advertisement', 'grill', 'windshield']
2022-03-16 09:29:36,056.056 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:42:12 iter: 10300 speed: 307.9 images/sec total_norm: 128.6246 (130.2854) loss: 158.3511 (158.1128) masked_loss: 1.7148 (1.7843) tag_loss: 156.4207 (156.3285) time: 1.4336 (1.6631) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.6580) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:29:36,418.418 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-16 09:29:36,418.418 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.6593017578125
2022-03-16 09:29:36,418.418 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.33844126187839
2022-03-16 09:29:42,732.732 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017056787386536598
2022-03-16 09:29:42,732.732 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:29:42,733.733 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'brown', 'bear', '[MASK]', 'on', 'quincy', 'of', 'cement', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:29:42,748.748 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'bear', 'head', 'eye', 'paw', 'ear', 'claw', 'rock', 'ground', 'shadow', 'mouth', 'face', 'water', 'leg', 'wall', 'arm', 'moss', 'brown', 'grass', 'snout', 'log', 'foot', 'stone', '[UNK]', 'nail', 'neck', 'background', 'zoo', 'ledge', 'dirt', 'reflection', 'muzzle', 'polar', 'large', 'tree', 'puddle', 'body', 'wood', 'animal', 'pond', 'fur', 'enclosure', 'top', 'trunk', 'step', 'leaf', 'tail', 'area', 'snow', 'white']
2022-03-16 09:29:58,733.733 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'top', 'ground', 'rock', 'brown', 'eye', 'neck', 'foot', 'leg', 'nose', 'ear', 'bear', 'shadow', 'trunk', 'log', 'moss', 'cement', 'claw', 'paw']
2022-03-16 09:32:22,324.324 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:39:35 iter: 10400 speed: 307.9 images/sec total_norm: 127.1969 (139.0600) loss: 159.6070 (161.2413) masked_loss: 1.8556 (1.9298) tag_loss: 157.9227 (159.3115) time: 1.4347 (1.6627) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.6575) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:32:22,684.684 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-16 09:32:22,684.684 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.86825561523438
2022-03-16 09:32:22,684.684 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.32633434477306
2022-03-16 09:32:29,033.033 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017041685059666634
2022-03-16 09:32:29,034.034 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:32:29,034.034 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'a', 'child', 'standing', 'outside', 'holding', 'dogs', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:32:29,049.049 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hand', 'fence', 'hair', 'head', 'tree', 'boy', '[UNK]', 'logo', 'man', 'face', 'sky', 'building', 'ground', 'arm', 'window', 'water', 'ear', 'bush', 'pole', 'short', 'track', 'floor', 'person', 'roof', 'leg', 'wall', 'hat', 'nose', 'glasses', 'eye', 'mouth', 'post', 'sidewalk', 'railing', 'train', 'shoe', 'table', 'jean', 'grass', 'light', 'background', 'design', 'camera', 'trunk', 'sign', 'child', 'umbrella', 'cap', 'strap']
2022-03-16 09:32:45,063.063 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'woman', 'short', 'ground', 'hair', 'girl', 'outside', 'person', 'floor', 'child', 'chair', 'foot', 'tree', 'shirt', 'dog', 'leg', 'bag', 'ear', 'bush', 'hat', 'cap', 'glasses', 'fence', 'collar', 'reflection', 'shoe', 'sidewalk', 'tile', 'sweater', 'sunglasses', 'chimney', 'cushion', 'leash']
2022-03-16 09:35:08,496.496 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:36:56 iter: 10500 speed: 308.1 images/sec total_norm: 126.7063 (129.0044) loss: 155.6191 (157.0873) masked_loss: 1.7329 (1.7761) tag_loss: 153.5807 (155.3112) time: 1.4328 (1.6617) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.6567) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:35:08,857.857 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957
2022-03-16 09:35:08,858.858 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.58441162109375
2022-03-16 09:35:08,858.858 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.31941950096274
2022-03-16 09:35:15,241.241 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017158575356006622
2022-03-16 09:35:15,242.242 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:35:15,242.242 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'gray', 'teddy', 'bear', 'laying', 'on', 'activation', '##tton', 'a', 'blanket', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:35:15,257.257 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'bear', 'head', 'ear', 'mouth', 'teddy', 'eye', 'bow', 'face', 'ribbon', 'arm', 'pillow', 'blanket', 'leg', 'foot', 'muzzle', 'bed', 'paw', 'stuffed', 'neck', 'cloth', 'white', '[UNK]', 'next', 'brown', 'couch', 'scarf', 'tag', 'animal', 'other', 'sheet', 'hair', 'hand', 'fur', 'flower', 'shirt', 'wall', 'tie', 'red', 'pad', 'tail', 'chair', 'floor', 'line', 'collar', 'top', 'dog', 'laying', 'small', 'letter']
2022-03-16 09:35:31,260.260 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'large', 'top', 'control', 'mouth', 'arm', 'eye', 'foot', 'gray', 'leg', 'nose', 'ear', 'bear', 'cloth', 'blanket', 'pillow', 'ribbon', 'teddy', 'stripe', 'scarf']
2022-03-16 09:37:54,695.695 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:34:18 iter: 10600 speed: 308.1 images/sec total_norm: 126.0428 (127.9087) loss: 157.5936 (158.6166) masked_loss: 1.8473 (1.8692) tag_loss: 155.2387 (156.7474) time: 1.4327 (1.6620) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4277 (1.6570) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:37:55,058.058 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 09:37:55,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.2294921875
2022-03-16 09:37:55,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.35984388690129
2022-03-16 09:38:01,513.513 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017201177775859833
2022-03-16 09:38:01,513.513 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:38:01,513.513 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'men', 'at', 'a', 'a', 'white', 'board', 'talking', 'with', 'a', 'samsung', 'sign', '##tees', 'them', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:38:01,529.529 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'suit', 'tie', 'shirt', 'tag', 'hair', '[UNK]', 'jacket', 'hand', 'wall', 'face', 'badge', 'head', 'glasses', 'person', 'name', 'sign', 'paper', 'table', 'floor', 'group', 'ribbon', 'flag', 'woman', 'poster', 'room', 'ear', 'book', 'light', 'chair', 'microphone', 'door', 'shoe', 'screen', 'business', 'beard', 'neck', 'nose', 'desk', 'letter', 'curtain', 'board', 'banner', 'ceiling', 'carpet', 'necklace', 'writing', 'plaque', 'bottle', 'arm']
2022-03-16 09:38:17,546.546 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'white', 'board', 'hair', 'person', 'floor', 'table', 'wall', 'writing', 'computer', 'letter', 'sign', 'shirt', 'screen', 'nose', 'display', 'suit', 'tie', 'tag', 'button', 'jacket', 'glasses', 'keyboard', 'badge', 'shoe', 'poster']
2022-03-16 09:40:41,205.205 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:31:41 iter: 10700 speed: 307.5 images/sec total_norm: 127.5492 (128.6291) loss: 153.5759 (155.3939) masked_loss: 1.7093 (1.7730) tag_loss: 152.3310 (153.6210) time: 1.4340 (1.6651) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4292 (1.6600) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:40:41,568.568 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.40625
2022-03-16 09:40:41,568.568 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.51718139648438
2022-03-16 09:40:41,568.568 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.32912543967917
2022-03-16 09:40:48,096.096 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017141008749604225
2022-03-16 09:40:48,096.096 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:40:48,097.097 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'brown', 'bear', 'standing', 'in', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:40:48,112.112 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'log', 'bear', 'ground', 'head', 'ear', 'water', 'bush', 'nose', 'leg', 'rock', 'mouth', 'eye', 'shadow', 'tree', 'snout', 'tongue', 'flower', 'face', 'plant', 'paw', 'back', 'background', '[UNK]', 'brown', 'river', 'zoo', 'dirt', 'fur', 'pond', 'wall', 'trunk', 'weed', 'boulder', 'large', 'teeth', 'wood', 'neck', 'branch', 'leaf', 'food', 'enclosure', 'fence', 'light', 'sun', 'reflection', 'walking', 'stone', 'black', 'patch']
2022-03-16 09:41:04,088.088 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'water', 'large', 'field', 'ground', 'rock', 'mouth', 'brown', 'leg', 'background', 'tongue', 'nose', 'ear', 'bear', 'shadow', 'grass', 'bush', 'flower', 'trunk', 'pond', 'log', 'curb', 'paw']
03-16 09:42:29.937 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 09:42:29.937 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 09:42:30.896 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 09:43:27,663.663 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:29:04 iter: 10800 speed: 307.6 images/sec total_norm: 133.1298 (138.8519) loss: 153.4497 (154.2352) masked_loss: 1.8300 (1.8623) tag_loss: 151.7598 (152.3729) time: 1.4340 (1.6646) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4289 (1.6596) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:43:28,024.024 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282
2022-03-16 09:43:28,025.025 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.58885192871094
2022-03-16 09:43:28,025.025 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.3430706339145
2022-03-16 09:43:34,601.601 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017158713191747665
2022-03-16 09:43:34,601.601 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:43:34,601.601 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'refrigerator', 'is', '[MASK]', 'a', 'work', 'space', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:43:34,617.617 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['refrigerator', 'wall', 'handle', 'floor', 'door', 'ground', 'box', '[UNK]', 'building', 'pole', 'room', 'ceiling', 'ladder', 'light', 'kitchen', 'beam', 'window', 'broom', 'cardboard', 'wood', 'board', 'bag', 'paper', 'cabinet', 'wheel', 'shelf', 'chair', 'sign', 'table', 'crate', 'next', 'stove', 'shadow', 'doorway', 'top', 'brick', 'shovel', 'oven', 'bucket', 'stick', 'view', 'old', 'pipe', 'tool', 'empty', 'cord', 'white', 'open', 'trash', 'picture']
2022-03-16 09:43:50,593.593 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'work', 'room', 'white', 'door', 'ground', 'board', 'space', 'floor', 'wall', 'window', 'box', 'handle', 'ceiling', 'stick', 'pole', 'refrigerator', 'broom']
2022-03-16 09:46:14,064.064 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:26:27 iter: 10900 speed: 307.7 images/sec total_norm: 128.8097 (130.8143) loss: 155.8550 (156.6989) masked_loss: 1.8910 (1.8794) tag_loss: 154.4810 (154.8195) time: 1.4332 (1.6640) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6590) save_time: 8.8805 (41.1344) lr: 0.000084 max mem: 26307
2022-03-16 09:46:14,426.426 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4285714328289032
2022-03-16 09:46:14,427.427 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.0312957763672
2022-03-16 09:46:14,427.427 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.3640462701971
2022-03-16 09:46:21,049.049 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0172384325414896
2022-03-16 09:46:21,049.049 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:46:21,050.050 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'in', 'shirt', 'and', 'tie', '[MASK]', 'at', 'a', 'desk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:46:21,065.065 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hand', 'shirt', 'glasses', 'wall', 'tie', 'face', 'hair', 'ear', 'ring', 'head', 'table', 'watch', 'desk', 'base', 'nose', 'chair', 'finger', '[UNK]', 'arm', 'pen', 'collar', 'lamp', 'sleeve', 'wrist', 'computer', 'paper', 'microphone', 'stand', 'laptop', 'cord', 'handle', 'holder', 'glass', 'mouth', 'beard', 'phone', 'mustache', 'office', 'eye', 'woman', 'bottle', 'keyboard', 'notebook', 'tray', 'logo', 'book', 'cup', 'knot', 'wire']
2022-03-16 09:46:37,152.152 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'hair', 'table', 'wall', 'arm', 'base', 'stand', 'chair', 'watch', 'ring', 'shirt', 'finger', 'nose', 'ear', 'desk', 'tie', 'wrist', 'glasses', 'collar', 'scissors', 'mustache']
2022-03-16 09:49:00,685.685 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:23:50 iter: 11000 speed: 307.3 images/sec total_norm: 126.9760 (130.5846) loss: 160.7057 (161.1748) masked_loss: 1.8001 (1.8784) tag_loss: 158.8291 (159.2964) time: 1.4342 (1.6662) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4292 (1.6613) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:49:01,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.46875
2022-03-16 09:49:01,044.044 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.17774963378906
2022-03-16 09:49:01,044.044 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.39145983446825
2022-03-16 09:49:07,716.716 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0172822754830122
2022-03-16 09:49:07,716.716 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:49:07,717.717 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', '[MASK]', 'walking', 'down', 'a', 'dirt', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:49:07,732.732 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'cow', 'grass', 'ground', 'head', 'ear', 'tree', '[UNK]', 'road', 'nose', 'path', 'tail', 'bush', 'face', 'eye', 'spot', 'horn', 'leaf', 'sky', 'dirt', 'fence', 'plant', 'herd', 'rock', 'pole', 'shirt', 'building', 'group', 'brown', 'hair', 'mud', 'gravel', 'cattle', 'field', 'light', 'forest', 'tag', 'street', 'man', 'next', 'area', 'animal', 'person', 'foot', 'hat', 'post', 'window', 'other', 'rope', 'pasture']
2022-03-16 09:49:23,784.784 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'road', 'ground', 'couple', 'eye', 'tree', 'spot', 'path', 'leg', 'nose', 'ear', 'grass', 'tail', 'dirt', 'leaf', 'cow']
2022-03-16 09:51:47,250.250 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:21:13 iter: 11100 speed: 307.4 images/sec total_norm: 125.0513 (126.8078) loss: 157.1694 (159.3545) masked_loss: 1.8038 (1.8433) tag_loss: 155.4464 (157.5112) time: 1.4338 (1.6656) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.6602) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:51:47,612.612 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 09:51:47,613.613 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.98617553710938
2022-03-16 09:51:47,613.613 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.38234519958496
2022-03-16 09:51:54,290.290 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01734146662056446
2022-03-16 09:51:54,290.290 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:51:54,291.291 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'four', 'small', '[MASK]', 'fly', 'low', 'in', 'the', 'obeyed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:51:54,306.306 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'airplane', 'wing', 'tail', 'propeller', 'cloud', 'field', 'wheel', 'plane', 'grass', 'person', 'gear', 'landing', 'cockpit', 'engine', 'car', '[UNK]', 'man', 'building', 'ground', 'stripe', 'road', 'air', 'pole', 'aircraft', 'bush', 'shirt', 'fence', 'group', 'post', 'small', 'water', 'body', 'sign', 'leg', 'roof', 'nose', 'house', 'kite', 'letter', 'flag', 'tire', 'blue', 'hill', 'cloudy', 'shadow', 'dirt', 'day', 'lot']
2022-03-16 09:52:10,392.392 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['small', 'low', 'wing', 'tree', 'sky', 'flag', 'wheel', 'tail', 'pole', 'airplane', 'cockpit', 'propeller']
2022-03-16 09:54:33,927.927 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:18:36 iter: 11200 speed: 307.2 images/sec total_norm: 128.4007 (128.9694) loss: 158.6474 (159.2351) masked_loss: 1.8398 (1.8541) tag_loss: 156.5146 (157.3810) time: 1.4344 (1.6668) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4297 (1.6617) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:54:34,287.287 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741
2022-03-16 09:54:34,287.287 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.29931640625
2022-03-16 09:54:34,287.287 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.42940487481852
2022-03-16 09:54:41,041.041 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017329899594187737
2022-03-16 09:54:41,041.041 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:54:41,042.042 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'gi', '[MASK]', '##fe', 'and', 'a', 'few', 'zebra', '##s', 'in', 'a', 'sandy', 'area', 'with', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:54:41,057.057 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'head', 'leg', '[UNK]', 'fence', 'neck', 'ground', 'rock', 'tail', 'zebra', 'zoo', 'mane', 'ear', 'enclosure', 'pole', 'dirt', 'shadow', 'grass', 'horn', 'spot', 'trunk', 'bush', 'eye', 'sky', 'hair', 'branch', 'mouth', 'wall', 'log', 'pen', 'group', 'other', 'stripe', 'next', 'area', 'boulder', 'post', 'building', 'animal', 'couple', 'plant', 'palm', 'leaf', 'nose', 'front', 'baby', 'bird', 'feeder', 'adult', 'standing']
2022-03-16 09:54:57,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'area', 'few', 'ground', 'rock', 'neck', 'tree', 'leg', 'ear', 'shadow', 'tail', 'pole', 'dirt', 'horn', 'sandy', 'fence', 'zoo', 'enclosure', 'stripe', 'mane', 'zebra']
2022-03-16 09:57:20,696.696 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:16:00 iter: 11300 speed: 307.0 images/sec total_norm: 129.0781 (130.7049) loss: 153.2269 (154.9657) masked_loss: 1.8300 (1.9013) tag_loss: 151.2944 (153.0645) time: 1.4341 (1.6677) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4291 (1.6626) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 09:57:21,057.057 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 09:57:21,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.20086669921875
2022-03-16 09:57:21,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.43586188868473
2022-03-16 09:57:28,051.051 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01731725223362446
2022-03-16 09:57:28,051.051 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 09:57:28,051.051 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'holding', 'a', 'tennis', '[MASK]', '##et', 'is', 'hitting', 'a', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 09:57:28,067.067 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'shoe', '[UNK]', 'ball', 'court', 'man', 'line', 'tennis', 'short', 'leg', 'hand', 'sock', 'head', 'shadow', 'wall', 'player', 'sign', 'letter', 'hair', 'arm', 'logo', 'ground', 'person', 'hat', 'stand', 'banner', 'box', 'knee', 'stripe', 'floor', 'handle', 'advertisement', 'foot', 'band', 'writing', 'sleeve', 'board', 'white', 'face', 'star', 'cap', 'glove', 'spectator', 'net', 'black', 'flag', 'base', 'ear', 'wrist', 'boy']
2022-03-16 09:57:44,116.116 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'court', 'short', 'hair', 'person', 'star', 'wall', 'arm', 'foot', 'box', 'ball', 'letter', 'piano', 'sign', 'shirt', 'handle', 'tennis', 'shadow', 'banner', 'beard', 'shoe', 'sock']
2022-03-16 10:00:07,651.651 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:13:24 iter: 11400 speed: 306.7 images/sec total_norm: 130.1305 (131.6654) loss: 157.6465 (157.6752) masked_loss: 1.7889 (1.8046) tag_loss: 155.8576 (155.8706) time: 1.4332 (1.6696) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.6645) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 10:00:08,011.011 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-16 10:00:08,012.012 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.66915893554688
2022-03-16 10:00:08,012.012 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.44781394626783
2022-03-16 10:00:14,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01733230985701084
2022-03-16 10:00:14,809.809 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:00:14,810.810 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'herd', 'of', 'cattle', 'grazing', '[MASK]', 'a', 'dry', 'field', 'with', 'snow', 'off', '[MASK]', '1690', 'distance', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:00:14,825.825 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'cow', 'field', 'snow', 'tree', 'animal', 'head', 'leg', 'tail', 'calf', 'water', '[UNK]', 'grass', 'trunk', 'shadow', 'puddle', 'ear', 'rock', 'horn', 'group', 'herd', 'dog', 'bird', 'cattle', 'bull', 'sky', 'branch', 'brown', 'wood', 'background', 'stick', 'couple', 'fence', 'stream', 'face', 'dry', 'sheep', 'small', 'number', 'horse', 'pole', 'open', 'bush', 'next', 'deer', 'grazing', 'large', 'paper', 'mud', 'grassy']
2022-03-16 10:00:30,813.813 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'field', 'ground', 'distance', 'tree', 'leg', 'dry', 'snow', 'object', 'shadow', 'grass', 'tail', 'bull', 'cow', 'herd', 'calf']
2022-03-16 10:02:54,597.597 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:10:48 iter: 11500 speed: 306.7 images/sec total_norm: 128.4373 (132.4142) loss: 155.3180 (156.7946) masked_loss: 1.8088 (1.8925) tag_loss: 153.7993 (154.9021) time: 1.4325 (1.6694) data: 0.0001 (0.0002) to_device: 0.0049 (0.0046) time_gpu: 1.4275 (1.6646) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 10:02:54,958.958 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 10:02:54,959.959 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.6671142578125
2022-03-16 10:02:54,959.959 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.49082729734224
2022-03-16 10:03:01,848.848 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017354995012283325
2022-03-16 10:03:01,848.848 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:03:01,849.849 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'gi', '[MASK]', '[MASK]', '##s', 'are', 'in', 'a', 'background', 'of', 'tall', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:03:01,864.864 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', '[UNK]', 'spot', 'neck', 'sky', 'head', 'ear', 'horn', 'eye', 'mouth', 'grass', 'nose', 'leg', 'bush', 'face', 'mane', 'rock', 'field', 'zoo', 'branch', 'knee', 'shadow', 'body', 'ground', 'next', 'dirt', 'other', 'trunk', 'tail', 'fence', 'tall', 'pole', 'front', 'standing', 'hair', 'leaf', 'hill', 'wall', 'flower', 'tongue', 'area', 'grassy', 'couple', 'large', 'green', 'top', 'animal', 'cloud', 'close', 'baby']
2022-03-16 10:03:17,822.822 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'mouth', 'eye', 'neck', 'tree', 'sky', 'spot', 'tall', 'background', 'nose', 'ear', 'horn']
2022-03-16 10:05:41,515.515 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:08:11 iter: 11600 speed: 306.7 images/sec total_norm: 129.7579 (131.5477) loss: 155.3367 (156.3531) masked_loss: 1.8594 (1.8574) tag_loss: 153.7790 (154.4957) time: 1.4328 (1.6692) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4277 (1.6642) save_time: 8.8805 (41.1344) lr: 0.000083 max mem: 26307
2022-03-16 10:05:41,876.876 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 10:05:41,877.877 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.26007080078125
2022-03-16 10:05:41,877.877 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.52288551004524
2022-03-16 10:05:48,771.771 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01736370287835598
2022-03-16 10:05:48,771.771 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:05:48,771.771 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'women', 'and', 'a', 'man', 'at', 'a', 'table', 'with', 'a', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:05:48,787.787 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'woman', 'glasses', 'suit', 'face', 'glass', 'table', 'man', 'wall', 'necklace', 'shirt', 'microphone', 'window', 'hand', 'cup', 'jacket', 'plate', 'juice', 'tie', 'neck', 'nose', 'head', '[UNK]', 'mug', 'food', 'mouth', 'person', 'name', 'eye', 'spoon', 'chair', 'pitcher', 'bowl', 'napkin', 'cake', 'drink', 'beer', 'paper', 'sign', 'lid', 'fork', 'straw', 'knife', 'handle', 'bread', 'front', 'container', 'logo', 'ear', 'bottle']
2022-03-16 10:06:04,871.871 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'woman', 'cup', 'hair', 'person', 'table', 'wall', 'food', 'glass', 'eye', 'neck', 'window', 'shirt', 'drink', 'nose', 'suit', 'plate', 'beer', 'tie', 'jacket', 'glasses', 'fork', 'juice', 'lid', 'necklace', 'microphone']
2022-03-16 10:08:28,622.622 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:05:36 iter: 11700 speed: 306.4 images/sec total_norm: 129.1662 (130.7679) loss: 154.2498 (154.9690) masked_loss: 1.7835 (1.8099) tag_loss: 152.5634 (153.1591) time: 1.4338 (1.6711) data: 0.0001 (0.0002) to_device: 0.0048 (0.0046) time_gpu: 1.4291 (1.6663) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:08:28,983.983 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3333333432674408
2022-03-16 10:08:28,983.983 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.54290771484375
2022-03-16 10:08:28,984.984 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.52462878469693
2022-03-16 10:08:35,956.956 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01742728240787983
2022-03-16 10:08:35,956.956 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:08:35,956.956 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', '[MASK]', 'playing', 'wii', 'in', '[MASK]', 'living', 'room', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:08:35,973.973 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'head', 'hand', 'wall', '[UNK]', 'hair', 'lamp', 'pole', 'tent', 'arm', 'face', 'leg', 'bed', 'ear', 'jean', 'pillow', 'couch', 'table', 'shoe', 'foot', 'floor', 'umbrella', 'cord', 'nose', 'glasses', 'light', 'blanket', 'shadow', 'person', 'mouth', 'short', 'room', 'ground', 'boy', 'eye', 'stand', 'shade', 'sheet', 'watch', 'chair', 'woman', 'hat', 'sock', 'top', 'window', 'strap', 'sleeve', 'curtain', 'laptop']
2022-03-16 10:08:52,038.038 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'room', 'light', 'living', 'hair', 'floor', 'table', 'wall', 'arm', 'window', 'watch', 'shirt', 'ear', 'desk', 'couch', 'pole', 'remote', 'wrist', 'monitor', 'shade', 'keyboard', 'sleeve', 'lamp', 'curtain', 'controller', 'wii', 'vent']
2022-03-16 10:11:15,709.709 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:03:00 iter: 11800 speed: 306.4 images/sec total_norm: 128.8577 (132.7833) loss: 156.1726 (156.8815) masked_loss: 1.7356 (1.8148) tag_loss: 154.0760 (155.0667) time: 1.4335 (1.6708) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4283 (1.6658) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:11:16,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374
2022-03-16 10:11:16,070.070 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.9473876953125
2022-03-16 10:11:16,070.070 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.55650919425388
2022-03-16 10:11:23,033.033 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017437878996133804
2022-03-16 10:11:23,034.034 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:11:23,034.034 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'older', 'woman', '[MASK]', 'a', 'leopard', 'jacket', 'sitting', 'on', 'a', 'sep', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:11:23,049.049 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'sidewalk', 'shoe', 'ground', 'hair', 'man', 'person', 'bag', 'woman', 'head', 'shirt', 'hand', 'jacket', 'leg', 'wall', 'jean', 'shadow', 'curb', 'street', 'arm', 'skirt', 'building', 'coat', 'hat', 'face', 'window', 'girl', 'dress', 'phone', 'road', 'tree', 'sign', 'pole', 'car', 'wheel', 'glasses', 'umbrella', 'line', 'foot', 'boy', 'sky', 'sock', 'purse', 'bench', 'fence', 'scarf', 'boot', 'light', 'water', 'lady']
2022-03-16 10:11:39,029.029 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'line', 'water', 'street', 'woman', 'ground', 'hair', 'person', 'arm', 'foot', 'jean', 'leg', 'dress', 'bag', 'snow', 'bird', 'shadow', 'coat', 'bottle', 'hat', 'jacket', 'bench', 'glasses', 'purse', 'skirt', 'shoe', 'sidewalk', 'leopard', 'pigeon', 'sunglasses', 'scarf', 'bracelet', 'sock']
03-16 10:12:30.989 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 10:12:30.989 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 10:12:32.084 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 89}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 93}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 10:14:02,733.733 2829:trainer.py:487 do_train_dict(): eta: 1 day, 1:00:23 iter: 11900 speed: 306.5 images/sec total_norm: 127.8818 (133.6626) loss: 157.7047 (159.3980) masked_loss: 1.8176 (1.8846) tag_loss: 155.8960 (157.5134) time: 1.4314 (1.6702) data: 0.0001 (0.0001) to_device: 0.0050 (0.0050) time_gpu: 1.4261 (1.6651) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:14:03,095.095 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 10:14:03,096.096 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.77464294433594
2022-03-16 10:14:03,096.096 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.57229817708334
2022-03-16 10:14:10,116.116 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017453555017709732
2022-03-16 10:14:10,116.116 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:14:10,116.116 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'street', 'corner', '[MASK]', 'people', 'and', 'a', 'horse', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:14:10,131.131 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'sign', 'building', 'car', '[UNK]', 'street', 'sky', 'tree', 'window', 'roof', 'road', 'pole', 'horse', 'light', 'person', 'sidewalk', 'plate', 'tail', 'house', 'license', 'leg', 'wall', 'curb', 'door', 'hair', 'shoe', 'hat', 'tire', 'shadow', 'line', 'traffic', 'bush', 'van', 'head', 'woman', 'bus', 'cloud', 'jacket', 'truck', 'bag', 'jean', 'windshield', 'city', 'chimney', 'fence', 'stop', 'grass', 'plant', 'wire']
2022-03-16 10:14:26,187.187 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'road', 'street', 'woman', 'car', 'person', 'child', 'wall', 'van', 'window', 'tree', 'corner', 'horse', 'sign', 'sky', 'jean', 'shirt', 'roof', 'bag', 'kid', 'plate', 'shadow', 'wheel', 'license', 'cloud', 'pole', 'jacket', 'bike', 'bicycle', 'sidewalk', 'tire', 'chimney']
2022-03-16 10:16:49,568.568 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:57:46 iter: 12000 speed: 306.9 images/sec total_norm: 130.8918 (132.8581) loss: 158.2243 (160.7906) masked_loss: 1.7344 (1.8058) tag_loss: 156.1837 (158.9848) time: 1.4317 (1.6683) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.6632) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:16:49,928.928 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5405405163764954
2022-03-16 10:16:49,929.929 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 192.48577880859375
2022-03-16 10:16:49,929.929 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.55899867538578
2022-03-16 10:16:56,976.976 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017541565001010895
2022-03-16 10:16:56,976.976 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:16:56,977.977 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'truck', 'with', 'a', 'shovel', 'attached', 'to', 'the', '[MASK]', 'of', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:16:56,992.992 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tire', 'sky', 'truck', 'window', 'tree', 'road', 'mirror', 'logo', 'light', 'door', 'vest', 'pole', 'building', 'man', 'sign', '[UNK]', 'plate', 'wheel', 'street', 'windshield', 'ground', 'license', 'handle', 'number', 'snow', 'car', 'writing', 'wire', 'roof', 'front', 'bumper', 'jacket', 'line', 'head', 'rim', 'helmet', 'cab', 'traffic', 'hat', 'fence', 'person', 'safety', 'vehicle', 'house', 'driver', 'worker', 'letter', 'cone', 'grill', 'step']
2022-03-16 10:17:12,911.911 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'building', 'white', 'door', 'road', 'front', 'light', 'car', 'ground', 'writing', 'window', 'tree', 'sign', 'sky', 'safety', 'truck', 'plate', 'shadow', 'license', 'pole', 'jacket', 'wire', 'logo', 'cab', 'rim', 'tire', 'vest', 'windshield', 'hose', 'shovel']
2022-03-16 10:19:36,570.570 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:55:09 iter: 12100 speed: 306.6 images/sec total_norm: 130.7278 (134.0936) loss: 155.0712 (157.4512) masked_loss: 1.8360 (1.8537) tag_loss: 153.1510 (155.5975) time: 1.4336 (1.6700) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.6649) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:19:36,932.932 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 10:19:36,932.932 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.16307067871094
2022-03-16 10:19:36,933.933 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.57968120887631
2022-03-16 10:19:44,047.047 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0176058541983366
2022-03-16 10:19:44,047.047 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:19:44,047.047 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'boy', 'standing', 'near', '[MASK]', 'plate', 'holding', 'a', 'bat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:19:44,063.063 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'shoe', 'grass', 'dirt', 'bat', 'ground', 'tree', '[UNK]', 'fence', 'person', 'girl', 'woman', 'hair', 'shadow', 'hat', 'baseball', 'short', 'field', 'man', 'hand', 'jean', 'boy', 'head', 'pole', 'game', 'leg', 'child', 'light', 'glove', 'bench', 'ball', 'park', 'young', 'cap', 'arm', 'bottle', 'goal', 'net', 'bag', 'helmet', 'base', 'plate', 'gate', 'home', 'sunglasses', 'catcher', 'sock', 'little', 'kid', 'dress']
2022-03-16 10:20:00,109.109 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'home', 'hand', 'little', 'street', 'light', 'woman', 'short', 'field', 'ground', 'hair', 'girl', 'person', 'arm', 'boy', 'chair', 'tree', 'shirt', 'plate', 'shadow', 'grass', 'bottle', 'hat', 'cap', 'pole', 'bench', 'dirt', 'bat', 'fence', 'helmet', 'shoe', 'glove', 'sock']
2022-03-16 10:22:23,621.621 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:52:32 iter: 12200 speed: 306.5 images/sec total_norm: 130.1516 (132.5796) loss: 154.6140 (156.9583) masked_loss: 1.7344 (1.7705) tag_loss: 152.6684 (155.1878) time: 1.4329 (1.6705) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.6651) save_time: 8.8805 (41.1344) lr: 0.000082 max mem: 26307
2022-03-16 10:22:23,983.983 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374
2022-03-16 10:22:23,984.984 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.18557739257812
2022-03-16 10:22:23,984.984 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.56819152832031
2022-03-16 10:22:31,116.116 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017646921798586845
2022-03-16 10:22:31,116.116 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:22:31,117.117 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'green', 'and', 'red', 'fire', 'hydra', '##nt', 'in', 'the', 'pretending', 'of', 'a', 'yard', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:22:31,133.133 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'fire', 'house', '[UNK]', 'sky', 'roof', 'window', 'trunk', 'chain', 'building', 'top', 'cap', 'park', 'porch', 'bolt', 'ground', 'chimney', 'bush', 'path', 'sidewalk', 'fence', 'green', 'red', 'base', 'pole', 'branch', 'road', 'grassy', 'stair', 'car', 'post', 'leaf', 'yellow', 'person', 'field', 'dirt', 'door', 'flower', 'background', 'rock', 'hill', 'light', 'lawn', 'blue', 'railing', 'sign', 'step', 'pine', 'wall']
2022-03-16 10:22:47,139.139 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'building', 'top', 'park', 'red', 'car', 'fire', 'green', 'middle', 'chair', 'window', 'tree', 'sky', 'yard', 'background', 'roof', 'chain', 'grass', 'bush', 'cap', 'porch', 'trunk', 'bolt', 'driveway', 'balcony', 'sidewalk', 'plug', 'chimney']
2022-03-16 10:25:10,640.640 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:49:55 iter: 12300 speed: 306.6 images/sec total_norm: 131.7690 (135.9162) loss: 156.7102 (155.4787) masked_loss: 1.7398 (1.7606) tag_loss: 154.9301 (153.7180) time: 1.4325 (1.6702) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4276 (1.6653) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:25:11,001.001 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064
2022-03-16 10:25:11,006.006 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.38250732421875
2022-03-16 10:25:11,006.006 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.56977659656155
2022-03-16 10:25:18,219.219 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017645394429564476
2022-03-16 10:25:18,219.219 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:25:18,220.220 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'seated', 'at', 'a', 'chess', '[MASK]', 'with', 'a', '[MASK]', 'on', 'meek', 'other', 'side', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:25:18,235.235 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'ear', 'head', 'cat', 'wall', 'arm', 'eye', 'person', 'leg', 'nose', 'table', 'face', 'girl', 'shadow', 'hair', 'finger', 'woman', '[UNK]', 'picture', 'tail', 'shirt', 'paw', 'mouth', 'man', 'floor', 'elbow', 'thumb', 'boy', 'black', 'child', 'reflection', 'white', 'logo', 'top', 'photo', 'dress', 'front', 'toy', 'seat', 'chair', 'block', 'desk', 'stool', 'sweater', 'young', 'cord', 'ring', 'stand', 'tree', 'small']
2022-03-16 10:25:34,269.269 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'side', 'face', 'short', 'hair', 'table', 'wall', 'seat', 'arm', 'eye', 'chair', 'bar', 'block', 'shirt', 'nose', 'ear', 'cat', 'tail', 'cushion']
2022-03-16 10:27:57,870.870 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:47:19 iter: 12400 speed: 306.2 images/sec total_norm: 128.1555 (133.2681) loss: 154.2426 (156.2095) masked_loss: 1.8579 (1.8185) tag_loss: 152.0333 (154.3910) time: 1.4340 (1.6723) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4291 (1.6672) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:27:58,230.230 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-16 10:27:58,230.230 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.18087768554688
2022-03-16 10:27:58,230.230 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.57770806884766
2022-03-16 10:28:05,521.521 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017643945291638374
2022-03-16 10:28:05,521.521 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:28:05,521.521 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'close', '-', 'up', 'of', 'a', 'mail', '##sl', '##ot', 'with', 'a', 'pair', '[MASK]', 'scissors', 'for', 'the', 'handle', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:28:05,537.537 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'handle', 'door', 'wall', 'blade', 'pair', 'screw', '[UNK]', 'bolt', 'frame', 'hole', 'lock', 'box', 'metal', 'wood', 'piece', 'object', 'wooden', 'building', 'latch', 'old', 'window', 'bracket', 'number', 'top', 'knob', 'sign', 'drawer', 'light', 'table', 'circle', 'hook', 'hand', 'small', 'strap', 'tool', 'large', 'panel', 'buckle', 'close', 'clock', 'mirror', 'picture', 'letter', 'cross', 'board', 'plate', 'design', 'slot', 'ground']
2022-03-16 10:28:21,582.582 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'wall', 'wood', 'pair', 'handle', 'plate', 'screw', 'scissors']
2022-03-16 10:30:45,050.050 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:44:42 iter: 12500 speed: 306.3 images/sec total_norm: 129.3221 (135.3853) loss: 156.5036 (157.8376) masked_loss: 1.7588 (1.8073) tag_loss: 154.8768 (156.0303) time: 1.4334 (1.6719) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4285 (1.6670) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:30:45,411.411 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 10:30:45,411.411 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.59719848632812
2022-03-16 10:30:45,411.411 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.58966785007053
2022-03-16 10:30:52,686.686 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017681479454040527
2022-03-16 10:30:52,687.687 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:30:52,687.687 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'extremely', 'busy', 'street', 'with', 'cars', '[MASK]', 'people', 'on', 'the', 'sidewalk', 'holding', 'umbrella', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:30:52,702.702 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', 'building', 'street', 'person', 'sky', 'car', 'sign', 'road', '[UNK]', 'pole', 'sidewalk', 'man', 'light', 'bag', 'city', 'jacket', 'line', 'window', 'windshield', 'jean', 'woman', 'shoe', 'coat', 'license', 'rain', 'busy', 'plate', 'store', 'tire', 'hand', 'rainy', 'purse', 'traffic', 'backpack', 'truck', 'billboard', 'curb', 'puddle', 'bus', 'fire', 'taxi', 'hood', 'banner', 'boot', 'flag', 'wall', 'shirt', 'mirror', 'group', 'ground']
2022-03-16 10:31:08,744.744 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'hand', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'window', 'sign', 'sky', 'jean', 'traffic', 'bag', 'busy', 'pole', 'jacket', 'sidewalk', 'tire', 'umbrella', 'backpack', 'windshield']
2022-03-16 10:33:32,383.383 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:42:06 iter: 12600 speed: 306.0 images/sec total_norm: 130.8993 (136.5166) loss: 154.5221 (154.2137) masked_loss: 1.8932 (1.8495) tag_loss: 152.8065 (152.3642) time: 1.4343 (1.6733) data: 0.0002 (0.0002) to_device: 0.0049 (0.0046) time_gpu: 1.4293 (1.6684) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:33:32,743.743 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125
2022-03-16 10:33:32,743.743 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.60655212402344
2022-03-16 10:33:32,744.744 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.58588349349856
2022-03-16 10:33:40,084.084 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01774078607559204
2022-03-16 10:33:40,085.085 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:33:40,085.085 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'up', '[MASK]', 'of', 'a', 'hand', 'on', 'a', 'keyboard', 'by', 'a', 'monitor', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:33:40,100.100 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['light', 'keyboard', 'screen', 'person', 'hand', 'laptop', 'arm', 'computer', 'button', 'key', 'man', 'leg', 'finger', '[UNK]', 'monitor', 'icon', 'table', 'desk', 'logo', 'stripe', 'ball', 'carrot', 'mouse', 'shirt', 'cord', 'pad', 'dark', 'foot', 'woman', 'remote', 'line', 'lap', 'next', 'head', 'thumb', 'television', 'image', 'paper', 'floor', 'background', 'sleeve', 'someone', 'white', 'front', 'close', 'open', 'wire', 'orange', 'picture', 'wall']
2022-03-16 10:33:56,126.126 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'light', 'close', 'person', 'arm', 'view', 'computer', 'screen', 'leg', 'monitor', 'keyboard', 'sleeve']
2022-03-16 10:36:19,789.789 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:39:30 iter: 12700 speed: 305.8 images/sec total_norm: 129.5555 (133.1202) loss: 153.0539 (154.7452) masked_loss: 1.7588 (1.7756) tag_loss: 151.6066 (152.9696) time: 1.4344 (1.6741) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4291 (1.6690) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:36:20,150.150 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375
2022-03-16 10:36:20,150.150 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.58700561523438
2022-03-16 10:36:20,150.150 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.60678535699844
2022-03-16 10:36:27,501.501 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017798686400055885
2022-03-16 10:36:27,501.501 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:36:27,501.501 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'brown', 'dog', 'drinking', 'water', 'medicare', 'bowl', 'next', 'to', 'a', 'mirror', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:36:27,517.517 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'bowl', 'collar', 'leg', 'mirror', 'floor', 'paw', 'head', 'reflection', 'ear', 'carpet', 'mat', 'glass', '[UNK]', 'tray', 'water', 'door', 'rug', 'table', 'plate', 'tail', 'shadow', 'dish', 'handle', 'pole', 'neck', 'wall', 'frame', 'nose', 'chair', 'food', 'small', 'post', 'cup', 'eye', 'base', 'foot', 'brown', 'bag', 'railing', 'container', 'pillow', 'stand', 'step', 'metal', 'towel', 'paper', 'tile', 'cabinet', 'next']
2022-03-16 10:36:43,464.464 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'next', 'water', 'post', 'floor', 'brown', 'glass', 'eye', 'neck', 'dog', 'leg', 'bag', 'ear', 'bowl', 'frame', 'mirror', 'pole', 'collar', 'reflection', 'carpet', 'tray', 'mat', 'paw']
2022-03-16 10:39:07,069.069 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:36:53 iter: 12800 speed: 306.1 images/sec total_norm: 129.3338 (131.7621) loss: 155.1046 (156.8740) masked_loss: 1.7677 (1.7869) tag_loss: 152.9436 (155.0871) time: 1.4334 (1.6727) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4282 (1.6674) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:39:07,430.430 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 10:39:07,430.430 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.60511779785156
2022-03-16 10:39:07,430.430 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.62342787158582
2022-03-16 10:39:14,816.816 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017771968618035316
2022-03-16 10:39:14,816.816 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 10:39:14,816.816 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', '[MASK]', 'with', 'a', 'single', 'light', 'and', 'two', 'street', 'signs', 'next', 'to', 'a', '5', 'eleven', 'sign', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 10:39:14,832.832 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'building', 'light', 'pole', 'street', 'window', 'traffic', 'sign', 'tree', '[UNK]', 'road', 'sidewalk', 'line', 'car', 'city', 'roof', 'wall', 'cloud', 'person', 'lamp', 'clock', 'door', 'letter', 'stop', 'man', 'post', 'intersection', 'tower', 'balcony', 'tall', 'top', 'arrow', 'flag', 'large', 'wire', 'green', 'signal', 'tire', 'tail', 'truck', 'red', 'bus', 'wing', 'railing', 'wheel', 'side', 'antenna', 'van', 'blue', 'hand']
2022-03-16 10:39:30,744.744 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'single', 'building', 'street', 'light', 'fire', 'window', 'sign', 'sky', 'electric', 'escape', 'eleven', 'pole', 'arrow', 'balcony', 'shutter']
2022-03-16 10:41:54,358.358 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:34:16 iter: 12900 speed: 306.1 images/sec total_norm: 128.6382 (133.1900) loss: 159.9225 (159.7316) masked_loss: 1.7283 (1.7726) tag_loss: 158.1663 (157.9591) time: 1.4334 (1.6730) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6679) save_time: 8.8805 (41.1344) lr: 0.000081 max mem: 26307
2022-03-16 10:41:54,718.718 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-16 10:41:54,718.718 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.11219787597656
2022-03-16 10:41:54,718.718 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 69.60770240196815 2022-03-16 10:42:02,240.240 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017777008935809135 2022-03-16 10:42:02,240.240 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:42:02,240.240 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'car', 'is', 'stopped', 'for', '[MASK]', 'red', 'light', 'at', 'an', 'intersection', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:42:02,256.256 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'sky', 'light', 'tree', 'street', 'sign', 'pole', 'person', 'plate', 'road', 'license', 'traffic', 'window', 'windshield', 'man', '[UNK]', 'building', 'van', 'number', 'sidewalk', 'mirror', 'back', 'shirt', 'reflection', 'bus', 'line', 'tail', 'logo', 'roof', 'tire', 'vehicle', 'curb', 'top', 'shadow', 'motorcycle', 'woman', 'wire', 'hood', 'arrow', 'head', 'truck', 'busy', 'next', 'bush', 'fence', 'trunk', 'intersection', 'bumper', 'city', 'group'] 2022-03-16 10:42:18,199.199 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'road', 'street', 'red', 'light', 'car', 'person', 'window', 'tree', 'sign', 'sky', 'shirt', 'bus', 'truck', 'plate', 'license', 'pole', 'intersection', 'logo', 'reflection', 'taxi', 'sidewalk', 'curb'] 03-16 10:42:32.140 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 10:42:32.140 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 10:42:33.152 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 10:44:41,841.841 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:31:40 iter: 13000 speed: 305.7 images/sec total_norm: 129.0177 (132.9092) loss: 156.4691 (157.1931) masked_loss: 1.6816 (1.7890) tag_loss: 154.4173 (155.4041) time: 1.4354 (1.6748) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4303 (1.6697) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:44:42,200.200 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 10:44:42,200.200 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 185.89173889160156 2022-03-16 10:44:42,200.200 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.55554513712876 2022-03-16 10:44:49,717.717 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017778154462575912 2022-03-16 10:44:49,717.717 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:44:49,718.718 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', 'is', 'getting', 'ready', 'to', '[MASK]', 'a', 'tennis', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:44:49,733.733 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tennis', 'wall', 'shoe', 'court', 'letter', 'hand', 'shadow', 'woman', 'leg', 'ball', 'shirt', 'man', 'hat', 'head', 'dress', 'outfit', 'cap', 'arm', 'line', 'ground', 'person', 'microphone', 'player', 'stand', 'camera', 'skirt', 'logo', 'hair', 'sunglasses', 'chair', 'top', 'sock', 'watch', 'writing', 'short', 'girl', 'spectator', 'sign', 'pole', 'net', 'word', 'uniform', 'seat', 'band', 'face', 'female', 'jacket', 'handle', 'tank'] 2022-03-16 10:45:05,846.846 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'woman', 'court', 'wall', 'ready', 'ball', 'letter', 'shirt', 'leg', 'dress', 'tennis', 'shadow', 'hat', 'cap', 'logo', 'shoe', 'outfit'] 2022-03-16 10:47:29,188.188 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:29:03 iter: 13100 speed: 306.0 images/sec total_norm: 129.6933 (131.9507) loss: 152.1934 (156.0730) masked_loss: 1.7662 (1.8039) tag_loss: 150.0671 (154.2691) time: 1.4323 (1.6735) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4273 (1.6684) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:47:29,549.549 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 10:47:29,549.549 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.70765686035156 2022-03-16 10:47:29,549.549 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.54891875295928 2022-03-16 10:47:37,115.115 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01780332811176777 2022-03-16 10:47:37,115.115 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:47:37,115.115 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'elephants', '[MASK]', 'on', 'top', 'of', 'a', 'grass', 'covered', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:47:37,130.130 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'tree', 'grass', 'leg', 'trunk', 'ear', '[UNK]', 'ground', 'shadow', 'head', 'eye', 'sky', 'bush', 'field', 'tail', 'foot', 'dirt', 'rock', 'branch', 'hill', 'path', 'green', 'grassy', 'standing', 'face', 'area', 'large', 'post', 'mountain', 'fence', 'water', 'walking', 'leaf', 'bird', 'next', 'stick', 'small', 'pole', 'forest', 'top', 'baby', 'herd', 'mouth', 'background', 'man', 'lush', 'building', 'group', 'animal', 'wall'] 2022-03-16 10:47:53,141.141 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'top', 'field', 'ground', 'rock', 'couple', 'eye', 'tree', 'sky', 'leg', 'ear', 'grass', 'bush', 'dirt', 'trunk', 'elephant'] 2022-03-16 10:50:16,821.821 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:26:27 iter: 13200 speed: 305.4 images/sec total_norm: 131.2391 (133.6553) loss: 154.5149 (156.5798) masked_loss: 1.7826 (1.8028) tag_loss: 152.3601 (154.7770) time: 1.4341 (1.6764) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.6712) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:50:17,182.182 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-16 10:50:17,183.183 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.74815368652344 2022-03-16 10:50:17,183.183 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.56220824736401 2022-03-16 10:50:24,759.759 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017842743545770645 2022-03-16 10:50:24,759.759 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:50:24,759.759 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'with', 'a', 'tub', ',', 'windows', ',', 'a', 'sink', 'and', 'a', 'spray', 'bottle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:50:24,774.774 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', '[UNK]', 'tile', 'window', 'bathroom', 'tub', 'sink', 'bottle', 'cabinet', 'floor', 'mirror', 'knob', 'door', 'handle', 'drawer', 'white', 'soap', 'vanity', 'bath', 'drain', 'curtain', 'room', 'toilet', 'tank', 'outlet', 'label', 'light', 'frame', 'towel', 'dish', 'rack', 'shelf', 'small', 'shower', 'ledge', 'kitchen', 'counter', 'top', 'pump', 'lid', 'next', 'reflection', 'board', 'pipe', 'blind', 'glass', 'rug', 'tiled', 'cup', 'seat'] 2022-03-16 10:50:40,748.748 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'floor', 'wall', 'window', 'handle', 'cabinet', 'mirror', 'bathroom', 'bottle', 'sink', 'spray', 'drawer', 'tile', 'tub', 'knob', 'vanity'] 2022-03-16 10:53:04,337.337 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:23:50 iter: 13300 speed: 305.6 images/sec total_norm: 131.4506 (133.4343) loss: 155.2207 (157.3795) masked_loss: 1.7986 (1.7909) tag_loss: 153.4039 (155.5886) time: 1.4334 (1.6751) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4284 (1.6698) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:53:04,698.698 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 10:53:04,698.698 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.47372436523438 2022-03-16 10:53:04,699.699 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.51962593420228 2022-03-16 10:53:12,355.355 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017834201455116272 2022-03-16 10:53:12,356.356 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:53:12,356.356 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'plate', 'of', 'food', 'including', 'meat', '[MASK]', 've', '##gg', '##ies', ',', 'and', 'grains', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:53:12,371.371 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'food', '[UNK]', 'meat', 'table', 'mushroom', 'carrot', 'potato', 'bread', 'ham', 'beef', 'cream', 'sauce', 'sausage', 'fork', 'egg', 'tomato', 'cheese', 'knife', 'stem', 'vegetable', 'glass', 'steak', 'handle', 'pepper', 'napkin', 'cup', 'spoon', 'butter', 'fruit', 'breakfast', 'onion', 'crust', 'white', 'banana', 'garlic', 'ice', 'meal', 'bacon', 'hole', 'bowl', 'sandwich', 'bean', 'slice', 'container', 'close', 'top', 'different', 'cloth', 'chicken'] 2022-03-16 10:53:28,351.351 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'table', 'food', 'glass', 'plate', 'meat', 'bread', 'egg', 'ham', 'sandwich', 'beef', 'sauce', 'mushroom', 'crust', 'shrimp', 'onion', 'carrot'] 2022-03-16 10:55:51,780.780 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:21:13 iter: 13400 speed: 305.8 images/sec total_norm: 129.1583 (133.6286) loss: 156.1652 (155.9487) masked_loss: 1.8205 (1.8168) tag_loss: 154.6690 (154.1320) time: 1.4326 (1.6744) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4274 (1.6693) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:55:52,142.142 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7027027010917664 2022-03-16 10:55:52,142.142 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.70547485351562 2022-03-16 10:55:52,143.143 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.51687588161893 2022-03-16 10:55:59,897.897 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01786199025809765 2022-03-16 10:55:59,898.898 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:55:59,898.898 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'train', 'coming', 'down', 'the', 'tracks', 'towards', 'a', 'depot', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:55:59,914.914 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['track', 'sky', 'pole', 'platform', 'tree', 'train', 'light', 'station', 'sign', '[UNK]', 'bench', 'line', 'window', 'shelter', 'wire', 'sidewalk', 'roof', 'door', 'stop', 'fence', 'ground', 'building', 'street', 'front', 'cloud', 'traffic', 'man', 'structure', 'post', 'gravel', 'windshield', 'car', 'number', 'gate', 'grass', 'bush', 'person', 'can', 'shirt', 'power', 'railroad', 'wall', 'blue', 'next', 'group', 'box', 'bus', 'trash', 'lamp', 'background'] 2022-03-16 10:56:15,873.873 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'group', 'line', 'station', 'building', 'door', 'light', 'woman', 'hair', 'track', 'person', 'child', 'boy', 'structure', 'train', 'tree', 'sign', 'sky', 'shirt', 'platform', 'bag', 'clock', 'tunnel', 'hat', 'pole', 'jacket', 'bench', 'shelter', 'depot', 'shoe', 'sidewalk'] 2022-03-16 10:58:39,487.487 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:18:37 iter: 13500 speed: 305.3 images/sec total_norm: 129.2224 (132.2318) loss: 157.3427 (159.0163) masked_loss: 1.8069 (1.7993) tag_loss: 155.2148 (157.2169) time: 1.4341 (1.6771) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4288 (1.6718) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 10:58:39,848.848 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4117647111415863 2022-03-16 10:58:39,848.848 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.60903930664062 2022-03-16 10:58:39,848.848 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.50097319659065 2022-03-16 10:58:47,541.541 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017866848036646843 2022-03-16 10:58:47,541.541 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 10:58:47,542.542 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', '[MASK]', 'del', '##i', 'counter', 'with', '[MASK]', 'variety', 'of', 'meat', 'with', 'different', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 10:58:47,557.557 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sandwich', '[UNK]', 'tray', 'food', 'bread', 'meat', 'cheese', 'container', 'plate', 'display', 'egg', 'cake', 'tomato', 'bowl', 'table', 'glass', 'vegetable', 'case', 'spoon', 'onion', 'fish', 'window', 'wall', 'sign', 'light', 'different', 'paper', 'cup', 'carrot', 'picture', 'ham', 'cookie', 'bunch', 'person', 'pan', 'napkin', 'leaf', 'pastry', 'knife', 'handle', 'potato', 'shelf', 'dish', 'hamburger', 'reflection', 'pizza', 'fork', 'salad', 'large', 'label'] 2022-03-16 10:59:03,527.527 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'different', 'table', 'wall', 'variety', 'counter', 'meat', 'bread', 'egg', 'cheese', 'sandwich', 'tray', 'tile', 'lemon', 'vegetable', 'tomato'] 2022-03-16 11:01:27,182.182 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:16:00 iter: 13600 speed: 305.3 images/sec total_norm: 129.0708 (132.6013) loss: 154.2439 (156.0019) masked_loss: 1.8078 (1.8053) tag_loss: 152.4674 (154.1966) time: 1.4335 (1.6769) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.6717) save_time: 8.8805 (41.1344) lr: 0.000080 max mem: 26307 2022-03-16 11:01:27,543.543 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 11:01:27,543.543 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.90118408203125 2022-03-16 11:01:27,543.543 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
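Each forward() report pairs a 50-tag "Sample Generation" list against the "GT Tags" of one sample, next to corpus-level "Tag Precision" (hovering around 69.5-69.7 in this stretch) and "Tag mAP". The exact metric definitions live in tagger_caption_uni_pipeline_expanding.py and are not visible in the log; the set-based precision@k below is only an illustrative simplification, using truncated tag lists from the first sample in this section.

def tag_precision_at_k(predicted, gt, k=20):
    # Fraction of the top-k predicted tags that appear in the ground
    # truth, in percent. Illustrative only: the pipeline's real "Tag
    # Precision" and "Tag mAP" definitions are not shown in this log.
    topk = predicted[:k]
    return 100.0 * len(set(topk) & set(gt)) / len(topk)

pred = ['dog', 'bowl', 'collar', 'leg', 'mirror', 'floor', 'paw', 'head',
        'reflection', 'ear', 'carpet', 'mat', 'glass', '[UNK]', 'tray',
        'water', 'door', 'rug', 'table', 'plate']
gt = ['head', 'water', 'floor', 'glass', 'dog', 'leg', 'ear', 'bowl',
      'mirror', 'collar', 'reflection', 'carpet', 'tray', 'mat', 'paw']
print(tag_precision_at_k(pred, gt))   # 75.0 for these truncated lists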
= 69.51166350650091 2022-03-16 11:01:35,327.327 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017877161502838135 2022-03-16 11:01:35,327.327 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:01:35,327.327 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', '[MASK]', 'a', 'large', 'group', 'of', 'people', 'standing', 'ar', '##oun', '##g', 'a', 'display', 'table', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:01:35,343.343 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'man', 'hair', 'tree', 'woman', 'jacket', 'building', 'glasses', 'tent', '[UNK]', 'crowd', 'head', 'shirt', 'box', 'window', 'hat', 'scarf', 'umbrella', 'sun', 'phone', 'hand', 'girl', 'sky', 'light', 'sweater', 'bag', 'group', 'cell', 'canopy', 'pole', 'purse', 'sunglasses', 'sign', 'face', 'coat', 'backpack', 'shoe', 'roof', 'paper', 'ceiling', 'book', 'boy', 'cup', 'table', 'stand', 'suit', 'cap', 'food', 'ground', 'camera'] 2022-03-16 11:01:51,257.257 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'man', 'hand', 'building', 'light', 'woman', 'ground', 'hair', 'girl', 'person', 'floor', 'table', 'seat', 'phone', 'glass', 'chair', 'tree', 'shirt', 'leg', 'plate', 'bottle', 'ceiling', 'hat', 'jacket', 'tape', 'glasses', 'purse', 'keyboard', 'tent', 'laptop', 'sweater'] 2022-03-16 11:04:14,729.729 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:13:23 iter: 13700 speed: 305.6 images/sec total_norm: 131.6675 (135.2967) loss: 154.9007 (155.5972) masked_loss: 1.8177 (1.8144) tag_loss: 152.9210 (153.7828) time: 1.4328 (1.6755) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.6704) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:04:15,090.090 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 11:04:15,090.090 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.4833984375 2022-03-16 11:04:15,091.091 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.56272937940514 2022-03-16 11:04:22,889.889 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0178784541785717 2022-03-16 11:04:22,889.889 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:04:22,889.889 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'grazing', 'alone', 'on', '[MASK]', '[MASK]', 'with', 'grass', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:04:22,905.905 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'leg', 'grass', 'mane', 'head', 'ear', 'eye', 'stripe', 'shadow', 'neck', 'ground', 'road', 'nose', '[UNK]', 'mouth', 'tail', 'body', 'bush', 'field', 'tree', 'face', 'path', 'gravel', 'back', 'hair', 'next', 'green', 'standing', 'side', 'snout', 'dirt', 'grassy', 'rock', 'grazing', 'leaf', 'other', 'spot', 'front', 'branch', 'patch', 'couple', 'fence', 'lone', 'tall', 'lush', 'sun', 'animal', 'flower', 'area', 'camera'] 2022-03-16 11:04:38,806.806 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'road', 'field', 'mouth', 'eye', 'neck', 'path', 'leg', 'nose', 'ear', 'shadow', 'grass', 'tail', 'stripe', 'mane', 'zebra'] 2022-03-16 11:07:02,324.324 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:10:46 iter: 13800 speed: 305.5 images/sec total_norm: 128.6839 (130.4294) loss: 152.7523 (154.7021) masked_loss: 1.8484 (1.9045) tag_loss: 151.0355 (152.7975) time: 1.4326 (1.6759) data: 0.0002 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4277 (1.6709) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:07:02,685.685 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 11:07:02,686.686 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.0855712890625 2022-03-16 11:07:02,686.686 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.58463380662657 2022-03-16 11:07:10,540.540 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017896931618452072 2022-03-16 11:07:10,540.540 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:07:10,541.541 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'a', 'picture', 'of', 'a', '[MASK]', 'sitting', 'on', '[MASK]', 'chair', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:07:10,556.556 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'flower', 'ground', 'wall', 'leg', 'plant', 'paw', 'nose', 'head', 'bench', 'wood', 'chair', 'tree', '[UNK]', 'building', 'seat', 'pot', 'leaf', 'bush', 'wooden', 'tail', 'sidewalk', 'window', 'back', 'eye', 'rope', 'branch', 'log', 'stick', 'white', 'grass', 'trunk', 'rock', 'pole', 'next', 'post', 'board', 'floor', 'weed', 'brick', 'face', 'door', 'garden', 'dirt', 'sculpture', 'gray', 'small', 'sign', 'statue'] 2022-03-16 11:07:26,496.496 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'ground', 'board', 'table', 'wall', 'seat', 'writing', 'stand', 'chair', 'plant', 'tree', 'wood', 'branch', 'sign', 'picture', 'leg', 'nose', 'ear', 'chain', 'cat', 'net', 'stick', 'flower', 'leaf', 'stem', 'hook', 'poster', 'paw'] 2022-03-16 11:09:50,126.126 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:08:10 iter: 13900 speed: 305.1 images/sec total_norm: 129.1128 (132.1185) loss: 154.0491 (156.0853) masked_loss: 1.8072 (1.8160) tag_loss: 152.3961 (154.2693) time: 1.4338 (1.6780) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.6731) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:09:50,485.485 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 11:09:50,485.485 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.5953826904297 2022-03-16 11:09:50,486.486 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.62439596993583 2022-03-16 11:09:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017922593280673027 2022-03-16 11:09:58,343.343 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:09:58,343.343 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'dog', 'attempts', 'to', 'grab', 'a', '[MASK]', 'fr', '##is', '[MASK]', 'with', 'it', "'", 's', 'mouth', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:09:58,359.359 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'dog', 'grass', 'leg', 'head', 'collar', 'cone', 'mouth', 'ear', 'shadow', '[UNK]', 'foot', 'ground', 'tail', 'shirt', 'eye', 'paw', 'field', 'nose', 'body', 'neck', 'hat', 'background', 'crowd', 'face', 'tree', 'man', 'park', 'object', 'flag', 'playing', 'arm', 'toy', 'tag', 'white', 'hand', 'flower', 'back', 'green', 'small', 'blue', 'air', 'spot', 'brown', 'top', 'spectator', 'woman', 'black', 'bell', 'fur'] 2022-03-16 11:10:14,238.238 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'small', 'ground', 'blue', 'mouth', 'person', 'eye', 'foot', 'shirt', 'dog', 'leg', 'ear', 'shadow', 'grass', 'hat', 'tag', 'collar', 'cone', 'paw'] 03-16 11:12:33.157 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 11:12:33.158 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 11:12:34.165 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 11:12:37,917.917 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:05:33 iter: 14000 speed: 305.1 images/sec total_norm: 132.0177 (134.6876) loss: 154.1951 (153.9942) masked_loss: 1.7925 (1.8280) tag_loss: 152.9146 (152.1662) time: 1.4323 (1.6779) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4273 (1.6729) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:12:38,278.278 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 11:12:38,278.278 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 125.2453384399414 2022-03-16 11:12:38,278.278 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
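The aml_server.py monitor() records interleaved with the training output poll all eight V100s and report one dict per GPU; near-constant 99-100% gpu_util with about 29,000 of 32,510 MiB used is roughly consistent with the trainer's own "max mem: 26307" (MiB) plus CUDA context overhead. aml_server.py's implementation is not shown in this log; the sketch below produces records of the same shape via nvidia-smi's machine-readable query interface rather than by parsing the human-readable table.

import subprocess

def query_gpus():
    # Returns e.g. [{'mem_used': 29000, 'mem_total': 32510,
    #                'gpu_util': 99}, ...], one dict per GPU, mirroring
    # the monitor() records above. This is an assumed reimplementation,
    # not aml_server.py's actual code.
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    gpus = []
    for line in out.strip().splitlines():
        used, total, util = (int(v) for v in line.split(", "))
        gpus.append({"mem_used": used, "mem_total": total,
                     "gpu_util": util})
    return gpus

print(query_gpus())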
= 69.6564180631164 2022-03-16 11:12:46,298.298 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01791957952082157 2022-03-16 11:12:46,299.299 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:12:46,299.299 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'car', 'is', 'parked', '[MASK]', '[MASK]', 'a', 'curb', 'with', 'its', 'brake', 'lights', 'on', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:12:46,314.314 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'light', 'grass', 'road', 'car', 'street', 'tree', 'sidewalk', 'pole', 'building', 'line', 'traffic', 'person', 'wall', 'fence', 'house', 'wire', '[UNK]', 'sign', 'tire', 'roof', 'van', 'window', 'brick', 'fire', 'license', 'cloud', 'curb', 'pillar', 'lawn', 'plate', 'city', 'bush', 'man', 'arrow', 'windshield', 'chimney', 'back', 'intersection', 'tail', 'truck', 'stop', 'bag', 'green', 'post', 'town', 'box', 'suv', 'shirt', 'column'] 2022-03-16 11:13:02,302.302 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'line', 'next', 'building', 'road', 'street', 'light', 'car', 'person', 'wall', 'tree', 'sky', 'traffic', 'brick', 'grass', 'column', 'cloud', 'pole', 'wire', 'sidewalk', 'brake', 'curb', 'pillar', 'bumper'] 2022-03-16 11:15:25,879.879 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:02:57 iter: 14100 speed: 304.8 images/sec total_norm: 130.9817 (132.8071) loss: 154.9062 (155.3979) masked_loss: 1.7900 (1.8193) tag_loss: 152.6038 (153.5786) time: 1.4333 (1.6796) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.6744) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:15:26,240.240 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-16 11:15:26,241.241 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.593505859375 2022-03-16 11:15:26,241.241 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.66850146441392 2022-03-16 11:15:34,201.201 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017911667004227638 2022-03-16 11:15:34,201.201 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:15:34,201.201 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'with', '[MASK]', 'long', 'tu', '##sk', '##s', 'standing', 'at', 'outing', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:15:34,216.216 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'trunk', 'tree', 'grass', 'eye', 'ear', 'leg', 'head', 'sky', '[UNK]', 'man', 'person', 'ground', 'rock', 'shirt', 'jacket', 'wall', 'structure', 'foot', 'short', 'bush', 'roof', 'fence', 'mouth', 'zoo', 'water', 'dirt', 'box', 'building', 'path', 'tank', 'large', 'hat', 'barrel', 'leaf', 'hair', 'couple', 'can', 'field', 'post', 'log', 'stick', 'stump', 'woman', 'chain', 'container', 'front', 'top', 'next', 'tail'] 2022-03-16 11:15:50,198.198 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'long', 'man', 'ground', 'rock', 'person', 'wall', 'eye', 'foot', 'tree', 'sky', 'shirt', 'leg', 'ear', 'grass', 'dirt', 'trunk', 'elephant', 'container'] 2022-03-16 11:18:13,681.681 2829:trainer.py:487 do_train_dict(): eta: 1 day, 0:00:20 iter: 14200 speed: 305.1 images/sec total_norm: 132.7275 (135.7271) loss: 154.6880 (155.5748) masked_loss: 1.7193 (1.7565) tag_loss: 152.7029 (153.8183) time: 1.4329 (1.6780) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6728) save_time: 8.8805 (41.1344) lr: 0.000079 max mem: 26307 2022-03-16 11:18:14,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 11:18:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.77853393554688 2022-03-16 11:18:14,045.045 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.67060537271567 2022-03-16 11:18:22,091.091 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.017940057441592216 2022-03-16 11:18:22,091.091 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:18:22,092.092 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bed', 'has', 'a', 'wooden', 'ɒ', 'and', 'a', 'brown', 'blanket', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:18:22,107.107 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'floor', 'sheet', 'blanket', 'towel', 'wall', 'mattress', '[UNK]', 'room', 'carpet', 'chair', 'pillow', 'paper', 'frame', 'window', 'leg', 'shadow', 'table', 'bar', 'curtain', 'drawer', 'door', 'bag', 'railing', 'rack', 'person', 'bedroom', 'rail', 'post', 'bunk', 'light', 'wooden', 'cushion', 'clothes', 'dresser', 'backpack', 'cabinet', 'shade', 'suitcase', 'top', 'ladder', 'handle', 'bench', 'pole', 'large', 'wood', 'book', 'lamp', 'furniture', 'outlet'] 2022-03-16 11:18:38,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'door', 'person', 'floor', 'bed', 'wall', 'brown', 'paper', 'bar', 'leg', 'wooden', 'frame', 'handle', 'shadow', 'doorway', 'sheet', 'furniture', 'blanket', 'pillow', 'carpet', 'towel', 'drawer', 'mattress', 'rack', 'dresser'] 2022-03-16 11:21:01,559.559 2829:trainer.py:487 do_train_dict(): eta: 23:57:43 iter: 14300 speed: 305.0 images/sec total_norm: 130.4659 (135.9736) loss: 153.5232 (156.5875) masked_loss: 1.7260 (1.7753) tag_loss: 151.6988 (154.8122) time: 1.4333 (1.6788) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6737) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:21:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.34375 2022-03-16 11:21:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.0206298828125 2022-03-16 11:21:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.71441623899672 2022-03-16 11:21:10,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01796768233180046 2022-03-16 11:21:10,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:21:10,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'surfer', 'walking', 'along', '[MASK]', 'sidewalk', '[MASK]', 'a', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:21:10,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'pole', 'fence', 'railing', '[UNK]', 'shirt', 'wall', 'man', 'head', 'ground', 'sign', 'person', 'leg', 'tree', 'shoe', 'arm', 'logo', 'car', 'boy', 'roof', 'balcony', 'hand', 'hat', 'sky', 'light', 'hair', 'box', 'helmet', 'jean', 'trash', 'door', 'grass', 'stair', 'short', 'can', 'jacket', 'banner', 'dirt', 'wheel', 'woman', 'tail', 'sand', 'rail', 'line', 'flag', 'step', 'sidewalk', 'bin', 'bench'] 2022-03-16 11:21:25,871.871 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'cup', 'short', 'ground', 'rock', 'board', 'chair', 'window', 'step', 'box', 'beach', 'shirt', 'leg', 'gate', 'sand', 'hat', 'pole', 'fence', 'balcony', 'sidewalk', 'railing', 'grill', 'surfer'] 2022-03-16 11:23:49,600.600 2829:trainer.py:487 do_train_dict(): eta: 23:55:06 iter: 14400 speed: 304.7 images/sec total_norm: 128.5652 (129.8504) loss: 152.8820 (153.2037) masked_loss: 1.7608 (1.7700) tag_loss: 150.4307 (151.4338) time: 1.4338 (1.6804) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.6750) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:23:49,960.960 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 11:23:49,960.960 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.3422393798828 2022-03-16 11:23:49,960.960 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.74001364872373 2022-03-16 11:23:58,086.086 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018045194447040558 2022-03-16 11:23:58,087.087 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:23:58,087.087 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'yellow', 'cat', 'laying', 'down', 'with', 'a', 'white', '[MASK]', 'on', 'its', 'head', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:23:58,102.102 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['nose', 'eye', 'hat', 'head', 'face', 'blanket', 'wall', '[UNK]', 'bed', 'flower', 'cat', 'book', 'fur', 'towel', 'tail', 'light', 'hair', 'room', 'ceiling', 'table', 'lamp', 'white', 'body', 'pillow', 'frame', 'picture', 'chair', 'mouth', 'chest', 'leg', 'shelf', 'feather', 'animal', 'curtain', 'mirror', 'design', 'window', 'nightstand', 'person', 'top', 'fluffy', 'dog', 'sheet', 'cushion', 'button', 'couch', 'background', 'large', 'stripe', 'star'] 2022-03-16 11:24:14,140.140 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'body', 'white', 'bed', 'wall', 'eye', 'rail', 'nose', 'frame', 'cat', 'speaker', 'ceiling', 'hat', 'blanket', 'towel', 'stripe'] 2022-03-16 11:26:37,616.616 2829:trainer.py:487 do_train_dict(): eta: 23:52:30 iter: 14500 speed: 304.7 images/sec total_norm: 130.3180 (131.9520) loss: 154.3104 (156.0449) masked_loss: 1.6773 (1.6853) tag_loss: 152.8525 (154.3596) time: 1.4331 (1.6802) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4277 (1.6749) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:26:37,976.976 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 11:26:37,976.976 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 176.95310974121094 2022-03-16 11:26:37,977.977 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.74564669883415 2022-03-16 11:26:46,123.123 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018046459183096886 2022-03-16 11:26:46,124.124 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:26:46,124.124 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'parade', 'of', 'policemen', 'on', 'motorcycles', 'are', 'escorting', '[MASK]', 'bus', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:26:46,139.139 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'tire', 'road', 'light', 'windshield', '[UNK]', 'man', 'street', 'shadow', 'helmet', 'tree', 'person', 'bike', 'window', 'building', 'car', 'sign', 'pole', 'truck', 'mirror', 'line', 'wheel', 'head', 'license', 'shirt', 'plate', 'bus', 'sky', 'horn', 'grass', 'flag', 'fence', 'hat', 'police', 'jacket', 'bush', 'jean', 'officer', 'woman', 'hair', 'parade', 'front', 'traffic', 'bag', 'roof', 'sidewalk', 'van', 'stripe', 'curb', 'ladder'] 2022-03-16 11:27:02,035.035 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'officer', 'tree', 'sign', 'sky', 'bus', 'vehicle', 'truck', 'plate', 'shadow', 'license', 'pole', 'parade', 'motorcycle', 'helmet', 'tire', 'policeman', 'windshield'] 2022-03-16 11:29:25,702.702 2829:trainer.py:487 do_train_dict(): eta: 23:49:53 iter: 14600 speed: 304.6 images/sec total_norm: 132.5196 (133.7473) loss: 157.6348 (158.9924) masked_loss: 1.8264 (1.7979) tag_loss: 155.2544 (157.1945) time: 1.4330 (1.6808) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4279 (1.6757) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:29:26,063.063 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 11:29:26,063.063 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.8418731689453 2022-03-16 11:29:26,063.063 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.7567613562759 2022-03-16 11:29:34,259.259 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018040377646684647 2022-03-16 11:29:34,259.259 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:29:34,259.259 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'people', 'are', 'gathered', 'together', 'in', 'the', 'living', 'room', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:29:34,275.275 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'glasses', 'woman', 'person', 'curtain', 'shirt', 'hand', 'wall', 'head', 'room', 'hat', 'man', 'cup', 'laptop', 'girl', 'face', 'couch', 'table', 'jean', 'window', 'jacket', 'computer', '[UNK]', 'group', 'television', 'chair', 'blanket', 'boy', 'sweater', 'food', 'screen', 'glass', 'pillow', 'ear', 'keyboard', 'picture', 'bed', 'plate', 'coffee', 'floor', 'bag', 'shoe', 'monitor', 'book', 'lid', 'box', 'cap', 'watch', 'ponytail', 'light'] 2022-03-16 11:29:50,166.166 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'room', 'book', 'woman', 'cup', 'living', 'hair', 'girl', 'person', 'floor', 'table', 'wall', 'glass', 'chair', 'plant', 'watch', 'box', 'jean', 'shirt', 'screen', 'kid', 'speaker', 'hat', 'couch', 'jacket', 'globe', 'glasses', 'pillow', 'curtain', 'shelf', 'laptop', 'scarf'] 2022-03-16 11:32:13,736.736 2829:trainer.py:487 do_train_dict(): eta: 23:47:16 iter: 14700 speed: 304.7 images/sec total_norm: 131.3952 (134.9033) loss: 152.5495 (154.7191) masked_loss: 1.7493 (1.7666) tag_loss: 150.5648 (152.9524) time: 1.4330 (1.6803) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4280 (1.6752) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:32:14,098.098 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 11:32:14,098.098 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 193.5040283203125 2022-03-16 11:32:14,099.099 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.73174600343447 2022-03-16 11:32:22,380.380 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01804724894464016 2022-03-16 11:32:22,380.380 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:32:22,380.380 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'room', 'is', 'in', 'the', 'process', '[MASK]', 'being', 'renovated', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:32:22,396.396 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'book', 'table', 'outlet', 'room', 'cord', '[UNK]', 'television', 'shelf', 'chair', 'couch', 'bag', 'light', 'living', 'ground', 'coffee', 'speaker', 'plug', 'window', 'box', 'remote', 'switch', 'bottle', 'stand', 'wire', 'cup', 'screen', 'sofa', 'lamp', 'magazine', 'frame', 'wii', 'leg', 'paper', 'tv', 'controller', 'laptop', 'can', 'top', 'door', 'ceiling', 'candle', 'control', 'desk', 'toy', 'pillow', 'dvd', 'phone', 'fire'] 2022-03-16 11:32:38,334.334 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'book', 'door', 'living', 'television', 'ground', 'floor', 'table', 'wall', 'process', 'chair', 'coffee', 'bag', 'device', 'bottle', 'speaker', 'couch', 'switch', 'doorway', 'closet', 'cord', 'outlet', 'candle', 'socket'] 2022-03-16 11:35:01,893.893 2829:trainer.py:487 do_train_dict(): eta: 23:44:39 iter: 14800 speed: 304.5 images/sec total_norm: 129.2645 (131.0651) loss: 154.9791 (155.8498) masked_loss: 1.6983 (1.7583) tag_loss: 152.9878 (154.0915) time: 1.4332 (1.6816) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6765) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:35:02,254.254 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3888888955116272 2022-03-16 11:35:02,254.254 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 175.73284912109375 2022-03-16 11:35:02,254.254 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.71537181675033 2022-03-16 11:35:10,513.513 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018120817840099335 2022-03-16 11:35:10,513.513 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:35:10,514.514 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'is', 'skiing', 'in', 'a', 'des', '##olate', '[MASK]', 'snowy', 'area', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:35:10,529.529 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'jacket', 'ski', 'snow', 'helmet', 'ground', '[UNK]', 'lift', 'tree', 'cloud', 'person', 'child', 'pole', 'coat', 'wire', 'boy', 'glove', 'chair', 'head', 'building', 'kid', 'girl', 'boot', 'man', 'skier', 'track', 'mountain', 'hand', 'hat', 'light', 'tower', 'sign', 'fence', 'slope', 'hill', 'shadow', 'foot', 'line', 'roof', 'car', 'shoe', 'leg', 'snowy', 'window', 'house', 'cable', 'bench', 'group', 'background', 'arm'] 2022-03-16 11:35:26,477.477 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'area', 'line', 'building', 'road', 'ground', 'person', 'child', 'sun', 'mountain', 'tree', 'sky', 'leg', 'background', 'snow', 'cloud', 'lift', 'pole', 'jacket', 'wire', 'ski', 'helmet', 'glove', 'snowy'] 2022-03-16 11:37:50,121.121 2829:trainer.py:487 do_train_dict(): eta: 23:42:03 iter: 14900 speed: 304.4 images/sec total_norm: 130.1726 (133.2278) loss: 156.5107 (156.8679) masked_loss: 1.6728 (1.7032) tag_loss: 154.7569 (155.1647) time: 1.4331 (1.6823) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.6772) save_time: 8.8805 (41.1344) lr: 0.000078 max mem: 26307 2022-03-16 11:37:50,482.482 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 11:37:50,482.482 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 182.0166015625 2022-03-16 11:37:50,482.482 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.67814407348632 2022-03-16 11:37:58,796.796 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01809064671397209 2022-03-16 11:37:58,797.797 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:37:58,797.797 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sign', 'on', 'a', 'street', 'post', '[MASK]', 'smiling', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:37:58,812.812 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'building', 'window', 'sky', 'letter', 'pole', 'wall', '[UNK]', 'arrow', 'street', 'light', 'city', 'banner', 'writing', 'word', 'store', 'traffic', 'flag', 'air', 'cloud', 'side', 'escape', 'roof', 'tall', 'tree', 'large', 'different', 'logo', 'corner', 'skyscraper', 'door', 'front', 'lamp', 'circle', 'number', 'tower', 'balcony', 'many', 'way', 'line', 'person', 'red', 'car', 'hand', 'blue', 'next', 'bus', 'view', 'brick', 'glass'] 2022-03-16 11:38:14,781.781 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'street', 'post', 'word', 'wall', 'window', 'letter', 'sign', 'sky', 'arrow', 'banner', 'bolt', 'screw'] 2022-03-16 11:40:38,449.449 2829:trainer.py:487 do_train_dict(): eta: 23:39:27 iter: 15000 speed: 304.2 images/sec total_norm: 130.9289 (133.9651) loss: 153.5126 (156.2343) masked_loss: 1.7264 (1.7388) tag_loss: 151.7011 (154.4955) time: 1.4340 (1.6833) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4288 (1.6782) save_time: 8.8805 (41.1344) lr: 0.000077 max mem: 26307 2022-03-16 11:40:38,451.451 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0015000.pt 2022-03-16 11:40:47,847.847 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 11:40:47,848.848 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.01416015625 2022-03-16 11:40:47,848.848 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.68138496449451 2022-03-16 11:40:56,230.230 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0181103702634573 2022-03-16 11:40:56,230.230 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:40:56,230.230 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', '[MASK]', 'playing', 'with', 'kite', '[MASK]', 'on', 'the', 'beach', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:40:56,245.245 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'car', 'tree', 'person', 'jean', 'ground', 'house', 'man', 'string', 'tail', 'lot', 'building', 'shadow', 'shirt', 'parking', 'jacket', '[UNK]', 'bag', 'cloud', 'beach', 'pole', 'snow', 'roof', 'flag', 'child', 'woman', 'sand', 'hair', 'coat', 'backpack', 'air', 'truck', 'head', 'chair', 'hat', 'park', 'umbrella', 'van', 'sign', 'background', 'suv', 'ski', 'tent', 'balloon', 'wheel', 'vehicle', 'parachute', 'top', 'pile'] 2022-03-16 11:41:12,097.097 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'house', 'building', 'car', 'ground', 'person', 'lot', 'tree', 'beach', 'sky', 'jean', 'roof', 'bag', 'snow', 'truck', 'string', 'flag', 'parking', 'tail', 'cloud', 'jacket', 'umbrella', 'backpack', 'kite'] 03-16 11:42:34.197 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 11:42:34.197 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 11:42:35.427 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 11:43:35,193.193 2829:trainer.py:487 do_train_dict(): eta: 23:37:19 iter: 15100 speed: 289.7 images/sec total_norm: 131.6978 (133.8174) loss: 155.8946 (155.8844) masked_loss: 1.7415 (1.7408) tag_loss: 154.0142 (154.1436) time: 1.4346 (1.7674) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.6720) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:43:35,555.555 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 11:43:35,555.555 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.35296630859375 2022-03-16 11:43:35,555.555 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
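At iter 15000 checkpoint.py writes snapshot/model_iter_0015000.pt, and the cost shows up in the next record: the iter-15100 line reports speed down to 289.7 images/sec and a fresh save_time sample (9.0279 current). Below is a sketch of periodic checkpointing with that 7-digit zero-padded naming; the 5000-iteration period, function name, and saved fields are assumptions, since checkpoint.py's save() is not shown.

import os
import torch

def maybe_save(model, optimizer, iteration, out_dir, period=5000):
    # Write snapshot/model_iter_0015000.pt-style files every 'period'
    # iterations. checkpoint.py's real save() logic is not shown in
    # this log; this is an assumed minimal equivalent.
    if iteration % period != 0:
        return
    path = os.path.join(out_dir, "snapshot",
                        "model_iter_%07d.pt" % iteration)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    torch.save({"iteration": iteration,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)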
= 69.66675457201507 2022-03-16 11:43:43,960.960 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018129440024495125 2022-03-16 11:43:43,960.960 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:43:43,961.961 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'snow', 'board', 'sticking', 'out', 'of', 'the', 'deep', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:43:43,976.976 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'snow', 'fence', '[UNK]', 'ground', 'mountain', 'cloud', 'person', 'boot', 'board', 'ski', 'shoe', 'helmet', 'background', 'jacket', 'glove', 'foot', 'pole', 'man', 'building', 'leaf', 'head', 'pine', 'blue', 'branch', 'hand', 'arm', 'hill', 'track', 'leg', 'shadow', 'bag', 'face', 'coat', 'hat', 'rock', 'top', 'trunk', 'bush', 'backpack', 'grass', 'snowy', 'logo', 'strap', 'design', 'boy', 'plant', 'stick', 'roof'] 2022-03-16 11:43:59,791.791 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'ground', 'board', 'deep', 'mountain', 'tree', 'sky', 'snow', 'bush', 'cloud', 'trunk', 'fence', 'helmet', 'shoe', 'strap'] 2022-03-16 11:46:23,605.605 2829:trainer.py:487 do_train_dict(): eta: 23:34:42 iter: 15200 speed: 304.0 images/sec total_norm: 129.5658 (132.0758) loss: 155.2314 (155.5071) masked_loss: 1.8529 (1.8322) tag_loss: 153.0607 (153.6749) time: 1.4332 (1.6842) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.6791) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:46:23,966.966 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-16 11:46:23,966.966 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.22222900390625 2022-03-16 11:46:23,967.967 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.69110715778825 2022-03-16 11:46:32,409.409 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018128776922822 2022-03-16 11:46:32,409.409 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:46:32,409.409 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'programmes', 'sits', 'in', 'front', 'of', 'a', '[MASK]', 'and', 'a', 'counter', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:46:32,425.425 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', '[UNK]', 'shelf', 'sink', 'towel', 'window', 'bathroom', 'bottle', 'cup', 'container', 'cross', 'soap', 'box', 'lid', 'mirror', 'handle', 'can', 'star', 'ledge', 'candle', 'tile', 'bag', 'pipe', 'door', 'hook', 'floor', 'ceiling', 'paper', 'toilet', 'sponge', 'holder', 'basket', 'light', 'robe', 'cabinet', 'curtain', 'book', 'tank', 'tub', 'reflection', 'tissue', 'bowl', 'trash', 'frame', 'white', 'cord', 'rack', 'flag', 'roll', 'jar'] 2022-03-16 11:46:48,365.365 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'front', 'light', 'cup', 'floor', 'wall', 'cross', 'window', 'sign', 'bag', 'counter', 'mirror', 'bathroom', 'bottle', 'ceiling', 'sink', 'soap', 'keyboard', 'holder', 'towel', 'ribbon', 'shelf', 'container', 'drain', 'tile', 'glove', 'ledge', 'comb'] 2022-03-16 11:49:11,908.908 2829:trainer.py:487 do_train_dict(): eta: 23:32:05 iter: 15300 speed: 304.2 images/sec total_norm: 135.8611 (139.7964) loss: 148.9173 (150.7546) masked_loss: 1.7634 (1.7232) tag_loss: 147.4116 (149.0314) time: 1.4327 (1.6830) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.6778) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:49:12,270.270 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 11:49:12,271.271 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 178.60105895996094 2022-03-16 11:49:12,271.271 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.68669366217279 2022-03-16 11:49:20,858.858 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01814165525138378 2022-03-16 11:49:20,859.859 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:49:20,859.859 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lady', 'in', 'dark', 'clothes', 'with', '[MASK]', 'dark', 'bag', 'and', 'a', '[MASK]', 'bell', 'umbrella', 'is', 'standing', '[MASK]', 'neatly', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:49:20,875.875 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'umbrella', 'tree', 'sky', 'fence', 'person', 'woman', 'park', 'jacket', 'hand', '[UNK]', 'field', 'bag', 'coat', 'handle', 'car', 'sidewalk', 'background', 'pole', 'purse', 'hair', 'bench', 'leg', 'ground', 'road', 'building', 'shoe', 'girl', 'lady', 'head', 'shirt', 'trash', 'can', 'strap', 'light', 'jean', 'cloud', 'leaf', 'post', 'boot', 'green', 'rain', 'hood', 'dirt', 'trunk', 'man', 'bush', 'dress', 'path', 'arm'] 2022-03-16 11:49:36,902.902 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'hand', 'park', 'woman', 'dark', 'blue', 'post', 'person', 'lady', 'tree', 'sky', 'leg', 'bell', 'bag', 'handle', 'coat', 'grass', 'pole', 'fence', 'trash', 'umbrella', 'fencing'] 2022-03-16 11:52:00,579.579 2829:trainer.py:487 do_train_dict(): eta: 23:29:29 iter: 15400 speed: 303.6 images/sec total_norm: 132.1841 (136.1458) loss: 153.2925 (154.6275) masked_loss: 1.6942 (1.7564) tag_loss: 151.1323 (152.8711) time: 1.4336 (1.6867) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.6815) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:52:00,940.940 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 11:52:00,940.940 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.2457275390625 2022-03-16 11:52:00,940.940 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.71586608886719 2022-03-16 11:52:09,561.561 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018207857385277748 2022-03-16 11:52:09,561.561 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:52:09,562.562 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'snow', 'board', '##er', 'riding', '[MASK]', 'a', 'snow', '[MASK]', 'summit', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:52:09,577.577 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'snow', '[UNK]', 'man', 'jacket', 'ground', 'hat', 'person', 'glove', 'sky', 'coat', 'head', 'arm', 'hand', 'snowy', 'leg', 'pole', 'helmet', 'foot', 'ski', 'track', 'board', 'hill', 'slope', 'boot', 'cap', 'mountain', 'shoe', 'group', 'hood', 'day', 'skier', 'backpack', 'area', 'pine', 'trunk', 'face', 'side', 'cloud', 'sun', 'top', 'winter', 'poles', 'couple', 'woman', 'sign', 'forest', 'building', 'footprint', 'country'] 2022-03-16 11:52:25,528.528 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'ground', 'rock', 'arm', 'mountain', 'covered', 'tree', 'sky', 'snow', 'coat', 'hat', 'summit', 'jacket', 'glove'] 2022-03-16 11:54:49,147.147 2829:trainer.py:487 do_train_dict(): eta: 23:26:52 iter: 15500 speed: 303.7 images/sec total_norm: 131.8490 (134.0243) loss: 155.5661 (156.3271) masked_loss: 1.6858 (1.7185) tag_loss: 153.8783 (154.6086) time: 1.4333 (1.6857) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4283 (1.6803) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:54:49,507.507 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 11:54:49,507.507 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.46498107910156 2022-03-16 11:54:49,507.507 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.74902774126102 2022-03-16 11:54:58,154.154 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018178725615143776 2022-03-16 11:54:58,155.155 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:54:58,155.155 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'driver', 'examining', 'a', 'minor', 'traffic', 'accident', 'between', 'a', 'bus', 'and', 'a', 'car', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:54:58,170.170 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jean', 'road', 'man', 'plate', 'bus', 'license', 'street', '[UNK]', 'jacket', 'windshield', 'sign', 'window', 'shoe', 'sky', 'light', 'tire', 'person', 'shadow', 'shirt', 'building', 'leg', 'car', 'hair', 'mirror', 'tree', 'bumper', 'ladder', 'rack', 'number', 'hat', 'pole', 'line', 'wheel', 'woman', 'truck', 'head', 'stripe', 'sidewalk', 'flag', 'cloud', 'letter', 'bag', 'curb', 'van', 'hand', 'logo', 'city', 'front', 'fence', 'door'] 2022-03-16 11:55:14,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'number', 'line', 'building', 'road', 'street', 'light', 'woman', 'car', 'hair', 'wall', 'window', 'tree', 'store', 'minor', 'sign', 'sky', 'jean', 'shirt', 'bus', 'traffic', 'driver', 'leg', 'accident', 'plate', 'mirror', 'coat', 'license', 'cloud', 'jacket', 'logo', 'shoe', 'sidewalk', 'tire', 'curb', 'grill', 'windshield', 'bumper'] 2022-03-16 11:57:37,738.738 2829:trainer.py:487 do_train_dict(): eta: 23:24:15 iter: 15600 speed: 303.7 images/sec total_norm: 130.5420 (133.6573) loss: 153.7613 (154.1538) masked_loss: 1.6774 (1.7158) tag_loss: 152.5827 (152.4379) time: 1.4335 (1.6859) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.6808) save_time: 9.0279 (30.4322) lr: 0.000077 max mem: 26307 2022-03-16 11:57:38,101.101 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 11:57:38,101.101 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.8413543701172 2022-03-16 11:57:38,101.101 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.73642103839073 2022-03-16 11:57:46,803.803 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01817481964826584 2022-03-16 11:57:46,804.804 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 11:57:46,804.804 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'large', 'black', 'bird', 'perched', 'next', 'to', 'an', 'outdoor', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 11:57:46,819.819 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'building', 'window', 'hand', 'bird', 'number', 'statue', 'sculpture', 'beak', 'wing', 'sky', 'head', '[UNK]', 'horse', 'large', 'face', 'wall', 'feather', 'tail', 'skyscraper', 'foot', 'neck', 'roof', 'leg', 'pole', 'tall', 'dinosaur', 'tree', 'front', 'light', 'black', 'reflection', 'top', 'hour', 'eagle', 'mouth', 'animal', 'white', 'sign', 'person', 'shadow', 'next', 'big', 'line', 'stair', 'man', 'glass', 'tower', 'nest', 'balcony'] 2022-03-16 11:58:02,705.705 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'number', 'next', 'black', 'building', 'large', 'window', 'wing', 'horse', 'sky', 'bird', 'clock', 'tail', 'statue', 'outdoor', 'nest', 'balcony', 'feathers', 'feather', 'beak'] 2022-03-16 12:00:26,506.506 2829:trainer.py:487 do_train_dict(): eta: 23:21:39 iter: 15700 speed: 303.4 images/sec total_norm: 133.9581 (138.1694) loss: 153.1067 (155.1483) masked_loss: 1.6834 (1.7285) tag_loss: 151.2865 (153.4198) time: 1.4345 (1.6877) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.6825) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:00:26,868.868 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 12:00:26,869.869 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.30093383789062 2022-03-16 12:00:26,869.869 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.7733531420744 2022-03-16 12:00:35,560.560 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018227895721793175 2022-03-16 12:00:35,560.560 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:00:35,561.561 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'a', '[MASK]', '##board', '##er', '[MASK]', 'a', 'dangerous', 'trick', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:00:35,576.576 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'arm', 'short', 'shirt', 'man', '[UNK]', 'step', 'hand', 'stair', 'boy', 'shoe', 'railing', 'head', 'leg', 'rail', 'cloud', 'wall', 'wheel', 'mountain', 'park', 'hair', 'hat', 'person', 'ramp', 'palm', 'building', 'woman', 'top', 'skate', 'ledge', 'board', 'ground', 'air', 'bush', 'watch', 'trick', 'bench', 'grass', 'sunglasses', 'cap', 'young', 'sign', 'foot', 'girl', 'tank', 'concrete', 'face', 'roof', 'sidewalk'] 2022-03-16 12:00:51,484.484 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'top', 'park', 'short', 'hair', 'person', 'wall', 'arm', 'mountain', 'couple', 'foot', 'step', 'tree', 'sky', 'shirt', 'leg', 'dangerous', 'tank', 'wheel', 'cloud', 'pole', 'bench', 'trick', 'fence', 'shoe', 'ramp', 'skate', 'stair'] 2022-03-16 12:03:15,144.144 2829:trainer.py:487 do_train_dict(): eta: 23:19:02 iter: 15800 speed: 303.6 images/sec total_norm: 129.6254 (134.2665) loss: 150.7224 (155.0574) masked_loss: 1.7302 (1.7660) tag_loss: 148.7880 (153.2914) time: 1.4334 (1.6864) data: 0.0002 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4287 (1.6814) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:03:15,505.505 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 12:03:15,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.1243896484375 2022-03-16 12:03:15,505.505 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.77492259283486 2022-03-16 12:03:24,245.245 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018213197588920593 2022-03-16 12:03:24,245.245 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:03:24,245.245 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'orange', '##s', 'are', 'on', 'a', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:03:24,260.260 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['book', 'table', 'mouse', 'orange', 'apple', 'tomato', 'paper', 'plate', 'fruit', 'background', 'stack', '[UNK]', 'magazine', 'phone', 'stem', 'bowl', 'peach', 'napkin', 'desk', 'button', 'pen', 'pile', 'computer', 'pad', 'top', 'remote', 'wire', 'spoon', 'glass', 'cord', 'food', 'ball', 'wall', 'tray', 'laptop', 'writing', 'banana', 'screen', 'light', 'item', 'handle', 'container', 'pencil', 'reflection', 'keyboard', 'next', 'logo', 'basket', 'picture', 'case'] 2022-03-16 12:03:40,210.210 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'book', 'table', 'phone', 'couple', 'paper', 'computer', 'cell', 'background', 'orange', 'bowl', 'plate', 'fruit', 'apple', 'button', 'remote', 'stem', 'mouse', 'cloth', 'sauce', 'spoon', 'peach', 'tomato'] 2022-03-16 12:06:03,832.832 2829:trainer.py:487 do_train_dict(): eta: 23:16:26 iter: 15900 speed: 303.5 images/sec total_norm: 134.2012 (137.3562) loss: 156.0555 (156.1601) masked_loss: 1.6816 (1.7332) tag_loss: 154.2530 (154.4269) time: 1.4333 (1.6869) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.6817) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:06:04,197.197 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 12:06:04,197.197 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.95608520507812 2022-03-16 12:06:04,197.197 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.7871949672699 2022-03-16 12:06:13,055.055 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018209436908364296 2022-03-16 12:06:13,056.056 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:06:13,056.056 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'holding', 'a', 'skate', '##board', 'on', '[MASK]', 'side', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:06:13,071.071 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'shirt', 'grass', 'wheel', 'sidewalk', 'car', 'boy', 'arm', 'shoe', 'shadow', 'hand', 'ground', 'design', 'pad', 'head', 'street', 'window', 'road', 'face', 'tree', 'hair', 'helmet', 'curb', 'band', 'man', 'building', 'truck', 'person', 'glove', 'light', 'tire', 'plate', 'door', 'nose', 'park', 'eye', 'strap', 'leaf', 'license', 'jean', 'mouth', 'house', 'pole', 'bush', 'wall', 'fence', 'logo', 'line', 'elbow', 'board'] 2022-03-16 12:06:29,001.001 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'house', 'hand', 'side', 'face', 'building', 'door', 'road', 'street', 'young', 'light', 'car', 'ground', 'hair', 'arm', 'boy', 'eye', 'walk', 'window', 'tree', 'sky', 'shirt', 'shadow', 'wheel', 'grass', 'tail', 'shoe', 'sidewalk', 'glove'] 2022-03-16 12:08:52,669.669 2829:trainer.py:487 do_train_dict(): eta: 23:13:49 iter: 16000 speed: 303.3 images/sec total_norm: 130.1435 (133.7759) loss: 152.1563 (154.6496) masked_loss: 1.7270 (1.7722) tag_loss: 150.6622 (152.8774) time: 1.4337 (1.6883) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.6832) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:08:53,029.029 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-16 12:08:53,029.029 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.96426391601562 2022-03-16 12:08:53,029.029 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.81289227408652 2022-03-16 12:09:01,889.889 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01821252889931202 2022-03-16 12:09:01,889.889 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:09:01,889.889 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', 'being', '[MASK]', 'slicing', 'a', 'birthday', '[MASK]', 'while', 'another', 'man', 'is', 'staring', '[MASK]', 'him', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:09:01,905.905 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'wall', 'shirt', 'head', 'hand', 'hair', 'ear', 'flower', 'jacket', 'glasses', 'woman', 'person', '[UNK]', 'floor', 'carpet', 'jean', 'room', 'box', 'coat', 'shoe', 'table', 'cake', 'face', 'dress', 'switch', 'chair', 'plate', 'light', 'outlet', 'heart', 'arm', 'sign', 'tag', 'fork', 'knife', 'belt', 'leg', 'paper', 'can', 'bottle', 'watch', 'suit', 'tie', 'bag', 'group', 'stick', 'purse', 'rug', 'camera', 'next'] 2022-03-16 12:09:17,812.812 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'room', 'woman', 'hair', 'person', 'table', 'wall', 'chair', 'jean', 'shirt', 'ear', 'camera', 'plate', 'knife', 'birthday', 'flower', 'jacket', 'fork', 'cake', 'carpet', 'rug'] 2022-03-16 12:11:41,525.525 2829:trainer.py:487 do_train_dict(): eta: 23:11:13 iter: 16100 speed: 303.2 images/sec total_norm: 131.1263 (132.7787) loss: 155.3250 (156.2789) masked_loss: 1.7928 (1.7767) tag_loss: 153.0792 (154.5023) time: 1.4338 (1.6886) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.6834) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:11:41,886.886 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 12:11:41,886.886 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.29998779296875 2022-03-16 12:11:41,886.886 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.80913713243272 2022-03-16 12:11:50,847.847 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01819959282875061 2022-03-16 12:11:50,847.847 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:11:50,847.847 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'male', 'tennis', 'player', '[MASK]', 'leaned', '[MASK]', 'in', 'a', 'position', 'preparing', 'for', 'the', '[MASK]', 'players', 'serve', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:11:50,863.863 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tennis', 'shirt', 'man', 'shoe', 'court', 'short', 'head', 'ground', 'leg', 'hand', 'band', 'sock', 'hair', 'arm', 'player', 'handle', 'line', 'footprint', 'string', 'ball', 'ear', 'stripe', 'shadow', 'blue', 'dirt', 'face', 'wrist', 'sleeve', 'person', 'male', 'foot', 'outfit', 'hat', 'top', 'glove', 'track', 'net', 'tape', 'cap', 'knee', 'clay', 'ready', 'logo', 'letter', 'air', 'game', 'surface', 'action', 'shot'] 2022-03-16 12:12:06,758.758 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'hand', 'band', 'player', 'court', 'short', 'position', 'ground', 'hair', 'track', 'arm', 'male', 'shirt', 'leg', 'handle', 'tennis', 'shoe', 'sock'] 03-16 12:12:35.528 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 12:12:35.528 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 12:12:36.599 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}] 2022-03-16 12:14:30,343.343 2829:trainer.py:487 do_train_dict(): eta: 23:08:36 iter: 16200 speed: 303.3 images/sec total_norm: 130.7571 (133.1061) loss: 152.3263 (152.8713) masked_loss: 1.6009 (1.6793) tag_loss: 150.9995 (151.1920) time: 1.4324 (1.6882) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.6829) save_time: 9.0279 (30.4322) lr: 0.000076 max mem: 26307 2022-03-16 12:14:30,704.704 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 12:14:30,705.705 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.28192138671875 2022-03-16 12:14:30,705.705 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.80887898638204 2022-03-16 12:14:39,654.654 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018146023154258728 2022-03-16 12:14:39,654.654 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:14:39,654.654 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'and', '[MASK]', 'sitting', 'next', 'to', 'each', 'other', 'on', '[MASK]', 'park', 'bench', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:14:39,670.670 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'step', 'umbrella', 'shirt', 'ground', 'fence', 'person', 'window', 'bottle', 'jean', 'stair', 'man', 'grass', 'tree', 'woman', 'shoe', '[UNK]', 'bag', 'dog', 'bench', 'head', 'roof', 'hat', 'sidewalk', 'pole', 'hair', 'foot', 'book', 'couple', 'leg', 'hand', 'park', 'wall', 'cap', 'sky', 'shadow', 'curb', 'railing', 'dirt', 'watch', 'short', 'cup', 'paper', 'bird', 'backpack', 'rock', 'water', 'purse', 'lady', 'newspaper'] 2022-03-16 12:14:55,621.621 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'city', 'man', 'hand', 'next', 'building', 'book', 'road', 'park', 'woman', 'ground', 'hair', 'person', 'couple', 'window', 'step', 'tree', 'watch', 'jean', 'shirt', 'leg', 'bag', 'shadow', 'palm', 'grass', 'bottle', 'pole', 'bench', 'fence', 'shoe', 'trash', 'sidewalk', 'umbrella', 'curb', 'stair'] 2022-03-16 12:17:19,292.292 2829:trainer.py:487 do_train_dict(): eta: 23:05:59 iter: 16300 speed: 303.1 images/sec total_norm: 133.7826 (137.7139) loss: 152.2081 (153.8290) masked_loss: 1.7153 (1.6968) tag_loss: 150.0632 (152.1322) time: 1.4327 (1.6895) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.6842) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:17:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 12:17:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.068115234375 2022-03-16 12:17:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.83430718212593 2022-03-16 12:17:28,731.731 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018120521679520607 2022-03-16 12:17:28,732.732 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:17:28,732.732 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'blue', 'train', '[MASK]', 'outside', 'of', '[MASK]', 'train', 'station', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:17:28,748.748 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'sky', 'window', 'track', 'light', '[UNK]', 'platform', 'sidewalk', 'windshield', 'front', 'tree', 'gravel', 'logo', 'bumper', 'vent', 'pole', 'door', 'line', 'building', 'station', 'ground', 'roof', 'number', 'sign', 'car', 'person', 'fence', 'shirt', 'wire', 'man', 'cloud', 'blue', 'ladder', 'grass', 'engine', 'white', 'post', 'horn', 'stripe', 'bush', 'woman', 'writing', 'stop', 'railroad', 'bus', 'flag', 'wheel', 'passenger', 'wall', 'lamp'] 2022-03-16 12:17:44,731.731 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'station', 'top', 'door', 'road', 'front', 'light', 'car', 'ground', 'blue', 'track', 'person', 'wall', 'window', 'train', 'sky', 'shirt', 'platform', 'pole', 'logo', 'fence', 'gravel', 'sidewalk', 'vent', 'bumper'] 2022-03-16 12:20:08,210.210 2829:trainer.py:487 do_train_dict(): eta: 23:03:22 iter: 16400 speed: 303.1 images/sec total_norm: 130.1776 (134.3516) loss: 152.4139 (153.4830) masked_loss: 1.7335 (1.7460) tag_loss: 150.4457 (151.7370) time: 1.4329 (1.6892) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6840) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:20:08,572.572 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 12:20:08,572.572 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.09503173828125 2022-03-16 12:20:08,572.572 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.85548077207623 2022-03-16 12:20:17,604.604 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018132081255316734 2022-03-16 12:20:17,604.604 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:20:17,605.605 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kite', 'flies', 'over', 'a', 'small', 'group', '[MASK]', '[MASK]', 'by', 'the', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:20:17,620.620 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'tree', 'water', 'cloud', 'boat', 'person', 'tail', 'man', '[UNK]', 'beach', 'shirt', 'lake', 'shadow', 'string', 'sand', 'child', 'hat', 'grass', 'balloon', 'short', 'head', 'woman', 'building', 'house', 'shore', 'chair', 'group', 'jacket', 'umbrella', 'hair', 'ribbon', 'car', 'large', 'leg', 'boy', 'ground', 'distance', 'body', 'sail', 'air', 'colorful', 'ocean', 'forest', 'rope', 'day', 'rock', 'park', 'jean', 'bag'] 2022-03-16 12:20:33,525.525 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'group', 'small', 'water', 'person', 'tree', 'beach', 'sky', 'boat', 'sand', 'tail', 'cloud', 'fence', 'log', 'ribbon', 'kite'] 2022-03-16 12:22:57,176.176 2829:trainer.py:487 do_train_dict(): eta: 23:00:45 iter: 16500 speed: 303.0 images/sec total_norm: 130.1838 (133.3996) loss: 153.4620 (154.6140) masked_loss: 1.7177 (1.7425) tag_loss: 151.7218 (152.8715) time: 1.4323 (1.6897) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4273 (1.6847) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:22:57,538.538 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5128205418586731 2022-03-16 12:22:57,538.538 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.87037658691406 2022-03-16 12:22:57,538.538 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.82732795807253 2022-03-16 12:23:06,664.664 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01812019571661949 2022-03-16 12:23:06,664.664 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:23:06,665.665 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'people', 'are', 'waiting', 'outside', '[MASK]', 'their', 'bicycles', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:23:06,680.680 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'shirt', 'hair', 'building', 'man', 'woman', 'hat', '[UNK]', 'backpack', 'jean', 'sign', 'window', 'bicycle', 'sky', 'sidewalk', 'cap', 'jacket', 'bike', 'head', 'street', 'bag', 'wall', 'short', 'shoe', 'tree', 'pole', 'glasses', 'hand', 'sweater', 'sunglasses', 'door', 'store', 'letter', 'group', 'shadow', 'skirt', 'road', 'tire', 'ground', 'car', 'girl', 'city', 'bottle', 'roof', 'line', 'purse', 'light', 'boy', 'helmet', 'boot'] 2022-03-16 12:23:22,623.623 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'street', 'woman', 'hair', 'person', 'seat', 'arm', 'boy', 'phone', 'tree', 'letter', 'sign', 'sky', 'jean', 'shirt', 'palm', 'bottle', 'hat', 'cap', 'jacket', 'glasses', 'bike', 'boot', 'bicycle', 'basket', 'tire', 'backpack', 'sunglasses'] 2022-03-16 12:25:46,282.282 2829:trainer.py:487 do_train_dict(): eta: 22:58:09 iter: 16600 speed: 302.8 images/sec total_norm: 130.7999 (132.8385) loss: 155.6601 (155.2399) masked_loss: 1.8208 (1.8244) tag_loss: 153.3624 (153.4155) time: 1.4332 (1.6910) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4281 (1.6856) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:25:46,643.643 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 12:25:46,643.643 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.41220092773438 2022-03-16 12:25:46,643.643 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.84228118165524 2022-03-16 12:25:55,816.816 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0181451216340065 2022-03-16 12:25:55,816.816 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:25:55,816.816 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'police', '[MASK]', 'walking', 'two', 'bikes', 'down', 'the', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:25:55,831.831 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', '[UNK]', 'shirt', 'man', 'bike', 'bicycle', 'pot', 'shoe', 'umbrella', 'sidewalk', 'head', 'hand', 'flower', 'wall', 'plant', 'bag', 'woman', 'sign', 'basket', 'window', 'person', 'street', 'hat', 'door', 'wheel', 'hair', 'pipe', 'tire', 'pole', 'uniform', 'curb', 'strap', 'purse', 'road', 'apron', 'helmet', 'backpack', 'arm', 'belt', 'picture', 'jacket', 'leg', 'cap', 'poster', 'plate', 'brick', 'chair', 'city', 'light', 'flag'] 2022-03-16 12:26:11,745.745 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'door', 'street', 'woman', 'hair', 'police', 'person', 'wall', 'arm', 'officer', 'plant', 'window', 'shirt', 'bag', 'wheel', 'hat', 'uniform', 'pole', 'flower', 'bike', 'pipe', 'pot', 'bicycle', 'basket', 'shoe', 'curtain', 'sidewalk', 'tire', 'umbrella', 'poster', 'curb', 'strap'] 2022-03-16 12:28:35,604.604 2829:trainer.py:487 do_train_dict(): eta: 22:55:33 iter: 16700 speed: 302.4 images/sec total_norm: 130.0176 (132.9274) loss: 151.7409 (152.8994) masked_loss: 1.6811 (1.7035) tag_loss: 149.9599 (151.1959) time: 1.4324 (1.6932) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.6881) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:28:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 12:28:35,965.965 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.01580810546875 2022-03-16 12:28:35,965.965 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.85503101348877 2022-03-16 12:28:45,194.194 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018143733963370323 2022-03-16 12:28:45,195.195 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:28:45,195.195 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'close', 'up', 'of', '[MASK]', 'elephant', 'with', 'one', '[MASK]', 'open', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:28:45,210.210 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'ear', 'eye', 'head', 'trunk', 'face', 'skin', 'leg', 'body', '[UNK]', 'mouth', 'back', 'close', 'forehead', 'other', 'hair', 'line', 'wall', 'large', 'next', 'herd', 'adult', 'side', 'gray', 'rock', 'grass', 'tree', 'couple', 'camera', 'name', 'area', 'big', 'standing', 'picture', 'field', 'brown', 'water', 'small', 'green', 'open', 'ground', 'tongue', 'view', 'neck', 'grey', 'arm', 'tail', 'baby', 'pair', 'wild'] 2022-03-16 12:29:01,092.092 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'open', 'skin', 'eye', 'ear', 'tail', 'trunk', 'elephant'] 2022-03-16 12:31:24,721.721 2829:trainer.py:487 do_train_dict(): eta: 22:52:56 iter: 16800 speed: 302.8 images/sec total_norm: 130.6526 (133.5937) loss: 156.5758 (155.5780) masked_loss: 1.7173 (1.7069) tag_loss: 154.6486 (153.8711) time: 1.4330 (1.6912) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.6860) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:31:25,082.082 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 12:31:25,082.082 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.40505981445312 2022-03-16 12:31:25,083.083 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
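Each evaluation block pairs a Sample Generation list (the model's top predicted tags) with a GT Tags list, and a running "Tag Precision." percentage is logged after it. The exact definition behind tagger_caption_uni_pipeline_expanding.py:409 is not visible in the log; as an illustrative guess only, a set-overlap precision over the predicted tags would look like this (corpus-level accumulation omitted).

def tag_precision(predicted_tags, gt_tags):
    # Percentage of predicted tags that also appear in the ground truth.
    # Illustrative only: the pipeline's own metric may threshold scores
    # or average per image rather than over a flat prediction list.
    if not predicted_tags:
        return 0.0
    gt = set(gt_tags)
    hits = sum(1 for tag in predicted_tags if tag in gt)
    return 100.0 * hits / len(predicted_tags)

# The elephant example logged just above:
pred = ['elephant', 'ear', 'eye', 'head', 'trunk', 'face', 'skin', 'leg']
gt = ['[UNK]', 'head', 'face', 'open', 'skin', 'eye', 'ear',
      'tail', 'trunk', 'elephant']
print(tag_precision(pred, gt))  # 87.5 ('leg' is the one miss)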
= 69.85405857605342 2022-03-16 12:31:34,339.339 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01817173697054386 2022-03-16 12:31:34,339.339 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:31:34,340.340 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', 'standing', 'by', 'shore', 'in', 'ocean', '[MASK]', 'a', 'surf', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:31:34,355.355 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'water', 'hair', 'hand', 'head', 'man', 'sky', 'wave', 'arm', 'logo', 'beach', 'building', 'leg', 'board', 'suit', 'hill', 'house', 'wet', 'sand', 'foot', 'shore', 'face', 'person', 'rock', 'ocean', 'surfer', 'mountain', 'cliff', 'mouth', 'reflection', 'surf', 'strap', 'cord', 'cloud', 'tree', 'ear', 'background', 'foam', 'watch', 'design', 'grass', 'line', 'tower', 'rope', 'fin', 'nose', 'roof', 'stripe', 'shirt', 'shoe'] 2022-03-16 12:31:50,340.340 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'water', 'building', 'board', 'hair', 'mouth', 'arm', 'mountain', 'tree', 'beach', 'sky', 'ocean', 'leg', 'wave', 'ear', 'suit', 'wet', 'shore', 'sand', 'logo', 'reflection', 'strap', 'foam'] 2022-03-16 12:34:14,266.266 2829:trainer.py:487 do_train_dict(): eta: 22:50:20 iter: 16900 speed: 302.0 images/sec total_norm: 131.1679 (133.4499) loss: 151.9933 (152.8905) masked_loss: 1.6240 (1.6615) tag_loss: 150.5166 (151.2290) time: 1.4344 (1.6955) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4290 (1.6902) save_time: 9.0279 (30.4322) lr: 0.000075 max mem: 26307 2022-03-16 12:34:14,625.625 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-16 12:34:14,626.626 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.4562225341797 2022-03-16 12:34:14,626.626 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.8652432161219 2022-03-16 12:34:23,882.882 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018175462260842323 2022-03-16 12:34:23,882.882 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:34:23,882.882 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'sail', '##boat', 'ic', '##ome', '##s', 'close', 'to', '[MASK]', '[MASK]', 'people', 'can', 'see', 'each', 'other', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:34:23,897.897 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'person', 'sky', 'boat', 'rock', 'sail', 'bush', 'shirt', '[UNK]', 'reflection', 'beach', 'man', 'bottom', 'shore', 'number', 'tree', 'ocean', 'wave', 'mast', 'woman', 'top', 'head', 'shoreline', 'small', 'deck', 'motor', 'ground', 'wake', 'lake', 'grass', 'horizon', 'couple', 'land', 'bird', 'front', 'ripple', 'cross', 'hat', 'pole', 'group', 'body', 'mountain', 'white', 'sand', 'flag', 'hair', 'base', 'dirt', 'jacket', 'window'] 2022-03-16 12:34:39,869.869 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'man', 'number', 'water', 'ground', 'rock', 'person', 'tree', 'beach', 'sky', 'shirt', 'bottom', 'boat', 'ocean', 'wave', 'bush', 'reflection', 'sail', 'mast', 'kite'] 2022-03-16 12:37:03,559.559 2829:trainer.py:487 do_train_dict(): eta: 22:47:44 iter: 17000 speed: 302.4 images/sec total_norm: 129.8390 (134.4609) loss: 153.5485 (154.0217) masked_loss: 1.7482 (1.7960) tag_loss: 152.1056 (152.2257) time: 1.4337 (1.6929) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4287 (1.6877) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:37:03,919.919 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5789473652839661 2022-03-16 12:37:03,919.919 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.56826782226562 2022-03-16 12:37:03,919.919 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.8867988363344 2022-03-16 12:37:13,569.569 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018235348165035248 2022-03-16 12:37:13,570.570 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:37:13,570.570 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', 'men', '[MASK]', 'the', '[MASK]', 'with', '[MASK]', 'object', 'sailing', 'through', 'the', 'air', 'between', 'two', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:37:13,586.586 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'hand', '[UNK]', 'grass', 'shirt', 'arm', 'hat', 'man', 'leg', 'ground', 'cap', 'head', 'sky', 'shoe', 'jacket', 'person', 'bush', 'rock', 'forest', 'wood', 'path', 'woman', 'hill', 'face', 'hair', 'sunglasses', 'short', 'trail', 'dirt', 'sweatshirt', 'glasses', 'backpack', 'trunk', 'area', 'sleeve', 'sweater', 'boot', 'plant', 'bag', 'foot', 'jean', 'stick', 'pole', 'young', 'water', 'field', 'watch', 'air', 'branch', 'side'] 2022-03-16 12:37:29,527.527 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'air', 'woman', 'ground', 'rock', 'hair', 'person', 'arm', 'tree', 'watch', 'sky', 'shirt', 'path', 'leg', 'ear', 'object', 'grass', 'hat', 'cap', 'jacket', 'wrist', 'glasses', 'shoe'] 2022-03-16 12:39:52,995.995 2829:trainer.py:487 do_train_dict(): eta: 22:45:07 iter: 17100 speed: 302.2 images/sec total_norm: 132.8160 (137.0210) loss: 155.5006 (154.4930) masked_loss: 1.7396 (1.7455) tag_loss: 153.5885 (152.7476) time: 1.4322 (1.6944) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4271 (1.6893) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:39:53,356.356 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 12:39:53,357.357 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.45701599121094 2022-03-16 12:39:53,357.357 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9060405908629 2022-03-16 12:40:02,738.738 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018234478309750557 2022-03-16 12:40:02,738.738 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:40:02,738.738 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'sitting', 'on', 'top', 'of', 'a', '[MASK]', 'ledge', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:40:02,753.753 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wheel', 'hand', 'man', 'rock', 'arm', 'ground', 'ear', 'hair', 'head', 'leg', 'person', 'board', 'wall', 'shoe', 'poster', 'face', 'shadow', 'shirt', 'bracelet', 'sheep', 'tree', 'nose', 'short', 'back', 'tattoo', 'road', 'water', 'background', 'wrist', 'foot', 'mouth', 'beach', 'skate', 'picture', 'leaf', 'glasses', 'young', 'strap', 'top', 'logo', 'hat', 'sidewalk', 'eye', 'sunglasses', 'car', 'band', 'grass', 'jean', 'design'] 2022-03-16 12:40:18,679.679 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'top', 'ground', 'rock', 'wall', 'arm', 'paper', 'shirt', 'animal', 'leg', 'nose', 'ear', 'camera', 'wheel', 'hat', 'cap', 'glasses', 'sheep', 'cement', 'strap', 'ledge', 'bracelet'] 03-16 12:42:36.697 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 12:42:36.697 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 12:42:38.017 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 12:42:42,089.089 2829:trainer.py:487 do_train_dict(): eta: 22:42:30 iter: 17200 speed: 302.8 images/sec total_norm: 134.4014 (136.8270) loss: 152.6346 (152.4155) masked_loss: 1.7752 (1.7952) tag_loss: 151.0208 (150.6203) time: 1.4315 (1.6909) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.6858) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:42:42,453.453 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 12:42:42,453.453 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 175.4453125 2022-03-16 12:42:42,453.453 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.89723540730559 2022-03-16 12:42:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018246429041028023 2022-03-16 12:42:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:42:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'cat', 'looking', 'angry', 'as', 'it', 'is', '[MASK]', 'on', 'top', 'of', 'a', 'laptop', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:42:51,914.914 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'keyboard', 'ear', 'head', 'eye', 'laptop', '[UNK]', 'paw', 'key', 'nose', 'face', 'screen', 'computer', 'desk', 'black', 'button', 'table', 'wall', 'leg', 'paper', 'cord', 'tail', 'logo', 'mouse', 'person', 'top', 'pad', 'floor', 'book', 'next', 'light', 'white', 'cloth', 'carpet', 'bed', 'monitor', 'foot', 'chair', 'front', 'writing', 'kitten', 'speaker', 'pen', 'bottle', 'fur', 'window', 'shelf', 'bag', 'animal', 'lap'] 2022-03-16 12:43:07,852.852 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'top', 'cup', 'person', 'table', 'phone', 'key', 'eye', 'chair', 'paper', 'cell', 'shirt', 'screen', 'nose', 'bag', 'ear', 'desk', 'angry', 'cat', 'tail', 'bottle', 'keyboard', 'cord', 'laptop'] 2022-03-16 12:45:31,236.236 2829:trainer.py:487 do_train_dict(): eta: 22:39:52 iter: 17300 speed: 302.7 images/sec total_norm: 132.8506 (135.2566) loss: 155.6959 (154.1799) masked_loss: 1.6586 (1.7094) tag_loss: 154.3548 (152.4705) time: 1.4321 (1.6916) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.6864) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:45:31,596.596 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 12:45:31,597.597 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.93707275390625 2022-03-16 12:45:31,597.597 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.91809915126055 2022-03-16 12:45:41,103.103 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01827346906065941 2022-03-16 12:45:41,103.103 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:45:41,104.104 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'man', 'holding', 'a', '[MASK]', 'knife', '[MASK]', '[MASK]', 'right', 'hand', 'is', 'posing', 'for', 'a', 'snap', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:45:41,119.119 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'ear', 'tie', 'man', 'face', 'nose', 'hair', 'shirt', 'hand', 'head', 'wall', 'collar', 'teeth', 'suit', 'mouth', 'blade', 'knife', 'chin', 'eyebrow', 'smile', 'finger', 'jacket', 'neck', 'handle', 'scissors', '[UNK]', 'arm', 'ring', 'forehead', 'sleeve', 'knot', 'stripe', 'wrist', 'watch', 'thumb', 'coat', 'cuff', 'paper', 'white', 'button', 'blue', 'black', 'smiling', 'guy', 'picture', 'sword', 'young', 'dot', 'person', 'woman'] 2022-03-16 12:45:57,102.102 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'right', 'man', 'hand', 'face', 'young', 'hair', 'mouth', 'wall', 'arm', 'smile', 'eye', 'neck', 'shirt', 'teeth', 'finger', 'nose', 'ear', 'sharp', 'suit', 'chin', 'knife', 'tie', 'collar', 'snap', 'knot'] 2022-03-16 12:48:20,618.618 2829:trainer.py:487 do_train_dict(): eta: 22:37:15 iter: 17400 speed: 302.3 images/sec total_norm: 133.6412 (135.6275) loss: 153.5819 (154.5017) masked_loss: 1.7322 (1.7640) tag_loss: 152.0323 (152.7377) time: 1.4331 (1.6937) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.6886) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:48:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 12:48:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.4862060546875 2022-03-16 12:48:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.90597978864398 2022-03-16 12:48:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018285930156707764 2022-03-16 12:48:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:48:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'benches', 'are', 'facing', '[MASK]', 'water', 'surrounded', 'by', 'leaves', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:48:30,458.458 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'leaf', 'ground', 'park', 'bench', 'water', 'reflection', 'trunk', 'grass', 'leg', 'curb', 'pond', '[UNK]', 'puddle', 'lake', 'stream', 'area', 'sky', 'arm', 'forest', 'red', 'branch', 'pole', 'road', 'person', 'pool', 'flower', 'wall', 'seat', 'back', 'rock', 'bank', 'background', 'wooden', 'fence', 'path', 'light', 'foliage', 'fall', 'couple', 'head', 'pavement', 'top', 'trash', 'next', 'middle', 'sign', 'paint', 'bush', 'front'] 2022-03-16 12:48:46,400.400 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'park', 'ground', 'arm', 'tree', 'leg', 'background', 'grass', 'bench', 'leaf', 'trunk', 'reflection', 'curb'] 2022-03-16 12:51:09,802.802 2829:trainer.py:487 do_train_dict(): eta: 22:34:38 iter: 17500 speed: 302.6 images/sec total_norm: 130.7618 (133.5605) loss: 151.8541 (152.2430) masked_loss: 1.7109 (1.7954) tag_loss: 150.0105 (150.4476) time: 1.4323 (1.6918) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4272 (1.6868) save_time: 9.0279 (30.4322) lr: 0.000074 max mem: 26307 2022-03-16 12:51:10,163.163 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 12:51:10,163.163 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.96124267578125 2022-03-16 12:51:10,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.90612610903653 2022-03-16 12:51:19,676.676 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018312757834792137 2022-03-16 12:51:19,677.677 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:51:19,677.677 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'vehicles', 'and', '[MASK]', '[MASK]', 'busy', 'urban', 'city', 'setting', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:51:19,693.693 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'street', 'man', 'bus', 'person', '[UNK]', 'road', 'sidewalk', 'pole', 'motorcycle', 'sign', 'license', 'windshield', 'plate', 'car', 'window', 'van', 'number', 'jacket', 'helmet', 'city', 'shirt', 'tire', 'light', 'woman', 'bag', 'line', 'traffic', 'mirror', 'busy', 'bike', 'trash', 'backpack', 'can', 'decker', 'jean', 'vest', 'truck', 'shoe', 'coat', 'double', 'driver', 'curb', 'vehicle', 'hair', 'tree', 'sky', 'head', 'stop', 'purse'] 2022-03-16 12:51:35,599.599 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'number', 'building', 'road', 'street', 'woman', 'car', 'person', 'van', 'window', 'sign', 'shirt', 'bus', 'urban', 'setting', 'bag', 'plate', 'busy', 'license', 'pole', 'jacket', 'bike', 'motorcycle', 'helmet', 'shoe', 'sidewalk', 'tire', 'windshield'] 2022-03-16 12:53:59,165.165 2829:trainer.py:487 do_train_dict(): eta: 22:32:00 iter: 17600 speed: 302.3 images/sec total_norm: 133.8442 (135.4127) loss: 149.4632 (150.1285) masked_loss: 1.7335 (1.7827) tag_loss: 147.5871 (148.3459) time: 1.4327 (1.6936) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4275 (1.6886) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 12:53:59,525.525 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 12:53:59,526.526 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.092041015625 2022-03-16 12:53:59,526.526 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9159889436711 2022-03-16 12:54:09,065.065 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018323233351111412 2022-03-16 12:54:09,066.066 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:54:09,066.066 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'large', '[MASK]', 'of', 'people', 'riding', 'motorized', 'bicycles', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:54:09,082.082 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'motorcycle', 'man', 'tree', 'shirt', 'building', 'bike', 'street', 'car', 'helmet', 'light', 'road', 'tire', '[UNK]', 'woman', 'sign', 'hat', 'group', 'wheel', 'head', 'window', 'city', 'short', 'bicycle', 'pole', 'hair', 'sidewalk', 'bush', 'traffic', 'jacket', 'night', 'dress', 'shoe', 'line', 'suv', 'license', 'bag', 'mirror', 'parade', 'arm', 'sunglasses', 'busy', 'sky', 'balcony', 'bunch', 'hand', 'backpack', 'banner', 'truck', 'background'] 2022-03-16 12:54:24,975.975 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'group', 'building', 'large', 'top', 'road', 'street', 'light', 'woman', 'cup', 'short', 'car', 'person', 'tree', 'shirt', 'tank', 'wheel', 'hat', 'bike', 'barrel', 'motorcycle', 'helmet', 'shoe', 'sidewalk', 'tire', 'cone', 'backpack', 'curb', 'motorized'] 2022-03-16 12:56:48,669.669 2829:trainer.py:487 do_train_dict(): eta: 22:29:23 iter: 17700 speed: 302.1 images/sec total_norm: 132.1799 (134.3531) loss: 152.4666 (152.4722) masked_loss: 1.6963 (1.7527) tag_loss: 150.6600 (150.7195) time: 1.4339 (1.6951) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4288 (1.6897) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 12:56:49,029.029 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 12:56:49,029.029 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.92584228515625 2022-03-16 12:56:49,029.029 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.93238937720824 2022-03-16 12:56:58,642.642 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018357407301664352 2022-03-16 12:56:58,642.642 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:56:58,643.643 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'parked', '[MASK]', 'sitting', 'next', 'to', 'a', 'white', 'brick', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:56:58,658.658 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'mirror', 'motorcycle', 'bike', '[UNK]', 'hand', 'building', 'light', 'finger', 'handle', 'person', 'reflection', 'brick', 'door', 'man', 'window', 'seat', 'camera', 'front', 'shirt', 'button', 'helmet', 'nail', 'arm', 'jacket', 'pipe', 'wheel', 'glass', 'shadow', 'thumb', 'head', 'watch', 'sleeve', 'street', 'ceiling', 'side', 'road', 'pole', 'face', 'ring', 'tank', 'red', 'view', 'frame', 'wrist', 'top', 'hair', 'line', 'license', 'gas'] 2022-03-16 12:57:14,670.670 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'side', 'line', 'next', 'building', 'white', 'door', 'light', 'wall', 'glass', 'handle', 'mirror', 'brick', 'pole', 'bike', 'motorcycle', 'parked'] 2022-03-16 12:59:38,123.123 2829:trainer.py:487 do_train_dict(): eta: 22:26:46 iter: 17800 speed: 302.2 images/sec total_norm: 130.6366 (134.1097) loss: 151.3086 (150.3721) masked_loss: 1.6658 (1.6862) tag_loss: 149.1745 (148.6859) time: 1.4326 (1.6945) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4275 (1.6895) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 12:59:38,482.482 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 12:59:38,483.483 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 174.25140380859375 2022-03-16 12:59:38,483.483 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9307282090853 2022-03-16 12:59:48,142.142 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018369663506746292 2022-03-16 12:59:48,143.143 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 12:59:48,143.143 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'group', 'of', 'women', '[MASK]', 'next', 'to', 'each', 'other', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 12:59:48,158.158 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'man', 'head', 'person', 'cake', 'hand', 'woman', '[UNK]', 'shelf', 'table', 'cup', 'phone', 'plate', 'ear', 'face', 'napkin', 'girl', 'arm', 'kitchen', 'cell', 'hat', 'glasses', 'sweater', 'picture', 'bowl', 'laptop', 'apron', 'glass', 'light', 'bottle', 'screen', 'restaurant', 'food', 'camera', 'mouth', 'logo', 'design', 'top', 'nose', 'spoon', 'lady', 'beard', 'towel', 'wall', 'ceiling', 'pot', 'chair', 'oven', 'container'] 2022-03-16 13:00:04,096.096 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'head', 'man', 'group', 'hand', 'next', 'woman', 'cup', 'hair', 'girl', 'person', 'table', 'arm', 'phone', 'shirt', 'drink', 'plate', 'bottle', 'pan', 'cloth', 'cake', 'shelf', 'laptop', 'sweater', 'scarf', 'napkin'] 2022-03-16 13:02:27,583.583 2829:trainer.py:487 do_train_dict(): eta: 22:24:09 iter: 17900 speed: 302.1 images/sec total_norm: 132.8371 (135.1002) loss: 153.2622 (153.0360) masked_loss: 1.7467 (1.7355) tag_loss: 151.5882 (151.3004) time: 1.4324 (1.6946) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.6894) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:02:27,944.944 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6774193644523621 2022-03-16 13:02:27,945.945 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.03985595703125 2022-03-16 13:02:27,945.945 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.92331682840982 2022-03-16 13:02:37,697.697 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018341748043894768 2022-03-16 13:02:37,698.698 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:02:37,698.698 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'public', 'bus', 'stopping', 'with', 'mermaid', "'", 's', 'doors', 'open', 'at', 'night', 'time', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:02:37,713.713 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['light', 'window', 'windshield', '[UNK]', 'sign', 'road', 'ceiling', 'bus', 'front', 'street', 'number', 'door', 'bumper', 'sidewalk', 'wheel', 'building', 'tire', 'line', 'roof', 'stripe', 'sky', 'pole', 'car', 'plate', 'license', 'station', 'curb', 'night', 'rack', 'logo', 'person', 'bike', 'track', 'advertisement', 'mirror', 'wall', 'ground', 'man', 'snow', 'letter', 'fence', 'shadow', 'train', 'driver', 'pillar', 'top', 'city', 'shirt', 'tree', 'bicycle'] 2022-03-16 13:02:53,712.712 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'time', 'number', 'line', 'public', 'night', 'open', 'door', 'road', 'front', 'street', 'light', 'wall', 'window', 'sign', 'bus', 'snow', 'plate', 'wheel', 'ceiling', 'license', 'sidewalk', 'tire', 'rack', 'windshield'] 2022-03-16 13:05:17,174.174 2829:trainer.py:487 do_train_dict(): eta: 22:21:31 iter: 18000 speed: 301.9 images/sec total_norm: 133.8943 (139.1579) loss: 151.1933 (152.4651) masked_loss: 1.6574 (1.6425) tag_loss: 149.9055 (150.8227) time: 1.4328 (1.6959) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.6908) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:05:17,536.536 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 13:05:17,537.537 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.84056091308594 2022-03-16 13:05:17,537.537 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.93303950188569 2022-03-16 13:05:27,313.313 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01839062198996544 2022-03-16 13:05:27,313.313 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:05:27,314.314 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'toilet', 'is', 'next', 'to', 'a', 'curtain', '[MASK]', 'window', '.', 'factories', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:05:27,329.329 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'wall', 'window', 'towel', 'bathroom', '[UNK]', 'floor', 'toilet', 'door', 'seat', 'rod', 'light', 'sink', 'shower', 'cabinet', 'rack', 'knob', 'lid', 'holder', 'bottle', 'handle', 'white', 'room', 'hook', 'rug', 'tile', 'tub', 'paper', 'drawer', 'mirror', 'ceiling', 'frame', 'can', 'picture', 'small', 'tank', 'mat', 'shelf', 'lamp', 'ring', 'black', 'outlet', 'open', 'basket', 'vent', 'bag', 'fixture', 'trash', 'table', 'box'] 2022-03-16 13:05:43,208.208 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'next', 'building', 'door', 'light', 'floor', 'wall', 'seat', 'paper', 'window', 'handle', 'bathroom', 'sink', 'holder', 'towel', 'curtain', 'toilet', 'sweater', 'knob', 'vent'] 2022-03-16 13:08:06,736.736 2829:trainer.py:487 do_train_dict(): eta: 22:18:54 iter: 18100 speed: 302.0 images/sec total_norm: 129.3557 (132.3418) loss: 153.9916 (155.1580) masked_loss: 1.6476 (1.6790) tag_loss: 152.1224 (153.4790) time: 1.4326 (1.6956) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4273 (1.6903) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:08:07,097.097 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 13:08:07,097.097 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.42556762695312 2022-03-16 13:08:07,097.097 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9396828871507 2022-03-16 13:08:16,978.978 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01840473897755146 2022-03-16 13:08:16,979.979 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:08:16,979.979 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'clock', 'tower', '[MASK]', 'union', 'station', 'lights', 'up', 'in', 'the', 'evening', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:08:16,994.994 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flag', 'sky', 'light', 'clock', 'tower', 'pole', 'lamp', 'building', 'window', 'sign', 'hand', 'letter', 'balcony', 'american', 'street', '[UNK]', 'night', 'top', 'roof', 'railing', 'wall', 'blue', 'ball', 'word', 'post', 'tree', 'ring', 'traffic', 'lit', 'large', 'city', 'pillar', 'number', 'tall', 'front', 'wire', 'band', 'antenna', 'dome', 'face', 'brick', 'time', 'background', 'bell', 'spire', 'side', 'red', 'green', 'line', 'column'] 2022-03-16 13:08:32,924.924 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'hand', 'light', 'window', 'tower', 'letter', 'sky', 'evening', 'clock', 'flag', 'pole', 'wire', 'lamp', 'balcony'] 2022-03-16 13:10:56,421.421 2829:trainer.py:487 do_train_dict(): eta: 22:16:17 iter: 18200 speed: 301.7 images/sec total_norm: 137.3312 (138.9078) loss: 151.8200 (152.5950) masked_loss: 1.6154 (1.6629) tag_loss: 150.0909 (150.9321) time: 1.4334 (1.6969) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.6918) save_time: 9.0279 (30.4322) lr: 0.000073 max mem: 26307 2022-03-16 13:10:56,784.784 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 13:10:56,784.784 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.60134887695312 2022-03-16 13:10:56,784.784 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.95259657062468 2022-03-16 13:11:06,639.639 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018465938046574593 2022-03-16 13:11:06,640.640 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:11:06,640.640 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', '[MASK]', 'at', 'a', 'very', 'large', 'carrot', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:11:06,656.656 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'nose', 'man', 'wall', 'hand', 'ear', 'lamp', 'head', 'carrot', 'finger', 'mouth', 'face', 'shirt', 'eyebrow', '[UNK]', 'lip', 'shade', 'nail', 'window', 'hair', 'picture', 'chin', 'plant', 'forehead', 'door', 'curtain', 'frame', 'sweater', 'beard', 'jacket', 'orange', 'thumb', 'handle', 'couch', 'ceiling', 'table', 'mustache', 'button', 'shelf', 'cup', 'neck', 'cabinet', 'arm', 'vase', 'chair', 'sleeve', 'stem', 'ring', 'television', 'front'] 2022-03-16 13:11:22,707.707 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'large', 'mouth', 'wall', 'eye', 'shirt', 'picture', 'finger', 'nose', 'ear', 'frame', 'mirror', 'lip', 'couch', 'shade', 'eyebrow', 'pillow', 'beard', 'lamp', 'sofa', 'nail', 'vase', 'carrot'] 03-16 13:12:38.029 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 13:12:38.029 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 13:12:39.242 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 13:13:46,228.228 2829:trainer.py:487 do_train_dict(): eta: 22:13:40 iter: 18300 speed: 301.5 images/sec total_norm: 134.2271 (139.2529) loss: 153.4698 (156.0047) masked_loss: 1.6882 (1.6909) tag_loss: 151.8878 (154.3138) time: 1.4343 (1.6981) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4293 (1.6929) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:13:46,589.589 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 13:13:46,589.589 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.46722412109375 2022-03-16 13:13:46,589.589 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97382213758385 2022-03-16 13:13:56,464.464 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018461301922798157 2022-03-16 13:13:56,464.464 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:13:56,465.465 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'driving', 'down', 'a', 'street', 'near', 'a', 'business', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:13:56,480.480 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'sky', 'sign', 'tree', 'street', 'pole', 'light', 'sidewalk', 'window', 'car', '[UNK]', 'person', 'road', 'lamp', 'city', 'man', 'store', 'line', 'shirt', 'woman', 'post', 'wall', 'jacket', 'roof', 'arrow', 'cloud', 'jean', 'clock', 'can', 'curb', 'banner', 'bus', 'fence', 'van', 'hair', 'fire', 'truck', 'bicycle', 'bag', 'door', 'traffic', 'flower', 'trash', 'plate', 'flag', 'circle', 'plant', 'license', 'bike', 'tire'] 2022-03-16 13:14:12,384.384 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'road', 'street', 'light', 'business', 'car', 'post', 'person', 'wall', 'window', 'tree', 'sign', 'sky', 'wheel', 'bush', 'cloud', 'pole', 'meter', 'lamp', 'sidewalk', 'tire'] 2022-03-16 13:16:36,117.117 2829:trainer.py:487 do_train_dict(): eta: 22:11:03 iter: 18400 speed: 301.4 images/sec total_norm: 132.8894 (136.5378) loss: 152.1600 (152.8613) masked_loss: 1.6626 (1.7019) tag_loss: 150.3112 (151.1594) time: 1.4329 (1.6989) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4279 (1.6939) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:16:36,480.480 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 13:16:36,481.481 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.72280883789062 2022-03-16 13:16:36,481.481 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99499738538587 2022-03-16 13:16:46,443.443 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018462149426341057 2022-03-16 13:16:46,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:16:46,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tennis', 'player', 'serves', 'a', 'ball', 'during', 'a', 'tennis', 'game', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:16:46,458.458 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', 'court', '[UNK]', 'tennis', 'shadow', 'man', 'sock', 'shirt', 'short', 'hand', 'leg', 'line', 'head', 'wall', 'arm', 'hair', 'ball', 'person', 'ground', 'hat', 'player', 'letter', 'logo', 'cap', 'sign', 'fence', 'camera', 'woman', 'knee', 'chair', 'stand', 'handle', 'pole', 'banner', 'advertisement', 'skirt', 'stripe', 'male', 'outfit', 'band', 'match', 'sunglasses', 'air', 'top', 'flower', 'face', 'light', 'boy', 'clock', 'crowd'] 2022-03-16 13:17:02,432.432 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'game', 'line', 'player', 'court', 'short', 'ground', 'hair', 'person', 'wall', 'arm', 'ball', 'letter', 'shirt', 'leg', 'camera', 'tennis', 'shadow', 'fan', 'hat', 'globe', 'shoe', 'sock'] 2022-03-16 13:19:25,965.965 2829:trainer.py:487 do_train_dict(): eta: 22:08:25 iter: 18500 speed: 301.4 images/sec total_norm: 134.1048 (137.8646) loss: 151.1212 (151.3907) masked_loss: 1.6932 (1.6862) tag_loss: 149.7786 (149.7045) time: 1.4334 (1.6984) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.6932) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:19:26,327.327 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 13:19:26,328.328 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 179.7282257080078 2022-03-16 13:19:26,328.328 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99253123293641 2022-03-16 13:19:36,340.340 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018474362790584564 2022-03-16 13:19:36,340.340 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:19:36,340.340 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pitcher', 'with', 'his', 'arm', 'up', 'ancestors', 'ready', 'to', 'throw', 'a', 'ball', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:19:36,356.356 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'head', 'shirt', 'glove', 'baseball', 'ear', 'hat', 'nose', 'logo', 'face', 'arm', 'cap', 'hand', 'necklace', '[UNK]', 'eye', 'mouth', 'ball', 'belt', 'player', 'grass', 'hair', 'letter', 'number', 'field', 'jersey', 'tree', 'neck', 'stripe', 'fence', 'finger', 'sleeve', 'leg', 'uniform', 'buckle', 'ground', 'pitcher', 'pitch', 'mound', 'pole', 'dirt', 'wall', 'elbow', 'background', 'writing', 'shadow', 'wrist', 'person', 'net', 'top'] 2022-03-16 13:19:52,263.263 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'number', 'face', 'field', 'mouth', 'arm', 'ready', 'eye', 'baseball', 'ball', 'letter', 'shirt', 'jersey', 'finger', 'nose', 'ear', 'grass', 'bush', 'hat', 'cap', 'pitcher', 'logo', 'fence', 'sleeve', 'necklace', 'glove', 'stripe'] 2022-03-16 13:22:15,820.820 2829:trainer.py:487 do_train_dict(): eta: 22:05:48 iter: 18600 speed: 301.4 images/sec total_norm: 133.3804 (136.7868) loss: 153.2503 (157.0127) masked_loss: 1.7232 (1.7141) tag_loss: 151.7137 (155.2986) time: 1.4333 (1.6986) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.6935) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:22:16,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 13:22:16,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.94696044921875 2022-03-16 13:22:16,180.180 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98324886872807 2022-03-16 13:22:26,186.186 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018502885475754738 2022-03-16 13:22:26,187.187 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:22:26,187.187 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kite', 'is', 'seen', '[MASK]', 'high', 'on', 'a', '[MASK]', 'day', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:22:26,202.202 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'airplane', 'ocean', 'beach', 'kite', '[UNK]', 'person', 'tail', 'wave', 'wing', 'boat', 'jet', 'plane', 'air', 'man', 'shirt', 'sand', 'horizon', 'cloud', 'arm', 'bird', 'sun', 'large', 'hair', 'couple', 'head', 'chair', 'shore', 'blue', 'object', 'body', 'jacket', 'day', 'high', 'pole', 'cloudy', 'leg', 'small', 'woman', 'next', 'ship', 'clear', 'short', 'light', 'sea', 'hand', 'top', 'tree', 'low'] 2022-03-16 13:22:42,095.095 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['high', 'day', 'water', 'person', 'couple', 'beach', 'sky', 'shirt', 'ocean', 'airplane', 'kite', 'cloudy'] 2022-03-16 13:25:05,687.687 2829:trainer.py:487 do_train_dict(): eta: 22:03:11 iter: 18700 speed: 301.4 images/sec total_norm: 132.8717 (135.4333) loss: 152.6264 (154.7283) masked_loss: 1.7588 (1.7485) tag_loss: 151.3432 (152.9798) time: 1.4325 (1.6987) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4275 (1.6936) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:25:06,049.049 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 13:25:06,050.050 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.09976196289062 2022-03-16 13:25:06,050.050 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97958467361775 2022-03-16 13:25:16,158.158 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018503960222005844 2022-03-16 13:25:16,158.158 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:25:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'black', 'and', 'white', 'photograph', 'of', 'a', 'person', '[MASK]', 'bicycle', 'turning', 'onto', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:25:16,174.174 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'line', 'light', 'street', 'sign', 'night', 'car', 'pole', '[UNK]', 'sky', 'tree', 'photo', 'ground', 'highway', 'picture', 'sidewalk', 'shadow', 'photograph', 'building', 'white', 'track', 'background', 'dark', 'side', 'lane', 'black', 'reflection', 'traffic', 'number', 'rail', 'arrow', 'median', 'wheel', 'person', 'window', 'railroad', 'meter', 'vehicle', 'image', 'wall', 'fence', 'train', 'letter', 'front', 'arm', 'object', 'empty', 'mirror', 'snow', 'top'] 2022-03-16 13:25:32,130.130 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'white', 'road', 'light', 'car', 'hair', 'person', 'sign', 'shirt', 'wheel', 'pole', 'bike', 'photograph', 'bicycle', 'tire', 'backpack', 'stripe'] 2022-03-16 13:27:55,518.518 2829:trainer.py:487 do_train_dict(): eta: 22:00:33 iter: 18800 speed: 301.5 images/sec total_norm: 134.5318 (136.1979) loss: 149.8090 (152.3738) masked_loss: 1.7205 (1.7096) tag_loss: 148.0998 (150.6642) time: 1.4322 (1.6983) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.6931) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:27:55,878.878 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 13:27:55,879.879 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.90335083007812 2022-03-16 13:27:55,879.879 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97973515747717 2022-03-16 13:28:06,044.044 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01851644739508629 2022-03-16 13:28:06,044.044 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:28:06,045.045 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'flat', 'tv', 'screen', '[MASK]', 'on', 'top', '[MASK]', 'a', 'book', 'shelf', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:28:06,060.060 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['television', 'wall', 'shelf', 'book', 'floor', 'room', '[UNK]', 'railing', 'vent', 'stair', 'chair', 'screen', 'staircase', 'picture', 'carpet', 'rug', 'light', 'stand', 'building', 'speaker', 'living', 'center', 'hair', 'rail', 'window', 'entertainment', 'player', 'ceiling', 'game', 'clock', 'man', 'cabinet', 'boy', 'door', 'couch', 'table', 'step', 'dvd', 'cushion', 'tv', 'box', 'fish', 'post', 'ball', 'lamp', 'remote', 'person', 'leg', 'controller', 'vase'] 2022-03-16 13:28:22,014.014 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'book', 'television', 'post', 'floor', 'wall', 'stand', 'bar', 'step', 'cd', 'screen', 'dog', 'flat', 'dvd', 'clock', 'cabinet', 'carpet', 'shelf', 'drawer', 'outlet', 'stool', 'railing', 'knob', 'vent', 'stair'] 2022-03-16 13:30:45,705.705 2829:trainer.py:487 do_train_dict(): eta: 21:57:56 iter: 18900 speed: 300.8 images/sec total_norm: 131.1916 (134.0222) loss: 151.5475 (150.7730) masked_loss: 1.6999 (1.7189) tag_loss: 149.3567 (149.0541) time: 1.4326 (1.7019) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.6964) save_time: 9.0279 (30.4322) lr: 0.000072 max mem: 26307 2022-03-16 13:30:46,067.067 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 13:30:46,067.067 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.66168212890625 2022-03-16 13:30:46,067.067 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98123670879163 2022-03-16 13:30:56,195.195 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01858404651284218 2022-03-16 13:30:56,195.195 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:30:56,195.195 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'rural', 'area', 'with', 'several', 'cars', 'parked', 'and', 'a', 'plant', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:30:56,210.210 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'tree', 'road', 'sign', 'street', 'pole', 'building', 'sky', 'car', 'house', 'window', 'fence', 'sidewalk', 'railing', 'wall', 'ground', 'letter', 'curb', 'roof', '[UNK]', 'bush', 'trash', 'chair', 'plant', 'word', 'grass', 'basket', 'rail', 'arrow', 'post', 'light', 'side', 'can', 'parking', 'box', 'empty', 'writing', 'graffiti', 'back', 'background', 'door', 'tire', 'bicycle', 'balcony', 'leaf', 'city', 'stop', 'next', 'corner', 'wire'] 2022-03-16 13:31:12,070.070 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'area', 'several', 'line', 'building', 'road', 'street', 'car', 'wall', 'chair', 'plant', 'window', 'tree', 'rural', 'growing', 'sign', 'sky', 'background', 'pole', 'arrow', 'fence', 'sidewalk', 'curb'] 2022-03-16 13:33:35,839.839 2829:trainer.py:487 do_train_dict(): eta: 21:55:19 iter: 19000 speed: 300.9 images/sec total_norm: 133.8207 (134.2490) loss: 153.2930 (152.0901) masked_loss: 1.7395 (1.7414) tag_loss: 151.4269 (150.3487) time: 1.4338 (1.7013) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.6962) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:33:36,201.201 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4324324429035187 2022-03-16 13:33:36,201.201 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.6912078857422 2022-03-16 13:33:36,201.201 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9755395619657 2022-03-16 13:33:46,476.476 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018588216975331306 2022-03-16 13:33:46,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:33:46,476.476 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'living', '[MASK]', 'decor', 'with', 'couch', ',', 'love', 'seat', '[MASK]', 'rec', '##liner', ',', 'and', 'tv', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:33:46,491.491 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'television', 'wall', 'floor', 'window', 'room', 'chair', 'couch', 'living', 'rod', 'stand', 'remote', 'speaker', 'door', 'light', 'table', 'ceiling', 'bowl', 'sofa', 'lamp', 'basket', 'entertainment', '[UNK]', 'beam', 'mirror', 'center', 'coffee', 'pillow', 'tv', 'rug', 'picture', 'control', 'armchair', 'tile', 'fan', 'arm', 'furniture', 'shelf', 'ottoman', 'vase', 'cushion', 'flower', 'large', 'building', 'candle', 'flat', 'leather', 'plant', 'shade', 'cabinet'] 2022-03-16 13:34:02,436.436 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'love', 'door', 'light', 'television', 'floor', 'table', 'wall', 'seat', 'stand', 'chair', 'window', 'bowl', 'speaker', 'ceiling', 'couch', 'remote', 'rod', 'pillow', 'curtain', 'decor'] 2022-03-16 13:36:26,087.087 2829:trainer.py:487 do_train_dict(): eta: 21:52:42 iter: 19100 speed: 300.7 images/sec total_norm: 137.3558 (139.3374) loss: 154.1020 (155.2701) masked_loss: 1.6360 (1.6575) tag_loss: 152.4232 (153.6127) time: 1.4354 (1.7025) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4303 (1.6973) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:36:26,448.448 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 13:36:26,448.448 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.3153076171875 2022-03-16 13:36:26,448.448 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9952248732249 2022-03-16 13:36:36,758.758 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01856849528849125 2022-03-16 13:36:36,758.758 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:36:36,758.758 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'orange', 'and', 'gold', 'tour', 'amongst', 'parked', 'near', 'a', 'building', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:36:36,773.773 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'bus', 'window', 'windshield', 'cloud', 'tree', 'light', 'mirror', 'building', 'road', 'street', 'roof', '[UNK]', 'pole', 'wheel', 'fence', 'tire', 'sign', 'line', 'front', 'license', 'plate', 'person', 'door', 'shadow', 'car', 'stripe', 'grass', 'bumper', 'curb', 'logo', 'sidewalk', 'top', 'house', 'ground', 'lot', 'man', 'truck', 'white', 'hair', 'parking', 'traffic', 'bush', 'side', 'jean', 'shirt', 'letter', 'barrier', 'number', 'red'] 2022-03-16 13:36:52,697.697 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'building', 'door', 'road', 'front', 'street', 'light', 'car', 'person', 'gold', 'tour', 'wall', 'window', 'sign', 'sky', 'bus', 'roof', 'plate', 'shadow', 'wheel', 'mirror', 'grass', 'license', 'cloud', 'logo', 'sidewalk', 'tire', 'curb', 'windshield'] 2022-03-16 13:39:16,276.276 2829:trainer.py:487 do_train_dict(): eta: 21:50:05 iter: 19200 speed: 300.8 images/sec total_norm: 135.2329 (137.9147) loss: 152.9567 (152.9809) masked_loss: 1.6233 (1.6778) tag_loss: 151.2971 (151.3030) time: 1.4321 (1.7019) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4271 (1.6968) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:39:16,637.637 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-16 13:39:16,637.637 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 187.38980102539062 2022-03-16 13:39:16,637.637 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.95733359934752 2022-03-16 13:39:26,925.925 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018559526652097702 2022-03-16 13:39:26,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:39:26,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'carrying', 'a', 'bright', 'yellow', 'bag', 'and', 'holding', 'a', 'black', '[MASK]', '[MASK]', 'her', 'head', 'is', 'walking', 'down', '##nay', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:39:26,941.941 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fence', 'umbrella', 'ground', 'tree', 'woman', 'person', 'shoe', '[UNK]', 'leaf', 'sidewalk', 'leg', 'park', 'hair', 'bag', 'jean', 'hand', 'man', 'shirt', 'bench', 'jacket', 'ivy', 'bush', 'hat', 'girl', 'trunk', 'head', 'arm', 'coat', 'dress', 'wall', 'dirt', 'vine', 'purse', 'grass', 'skirt', 'couple', 'curb', 'plant', 'short', 'gate', 'weed', 'top', 'pole', 'backpack', 'gravel', 'foot', 'lady', 'can', 'flower', 'street'] 2022-03-16 13:39:42,840.840 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'head', 'hand', 'black', 'park', 'street', 'woman', 'ground', 'person', 'wall', 'stone', 'plant', 'tree', 'jean', 'yellow', 'shirt', 'bright', 'bag', 'gate', 'leaf', 'ivy', 'fence', 'shoe', 'cane', 'sidewalk', 'umbrella', 'weed'] 2022-03-16 13:42:06,669.669 2829:trainer.py:487 do_train_dict(): eta: 21:47:28 iter: 19300 speed: 300.5 images/sec total_norm: 136.0160 (139.3437) loss: 148.8186 (152.0599) masked_loss: 1.6841 (1.7017) tag_loss: 147.1089 (150.3581) time: 1.4345 (1.7039) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4293 (1.6987) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:42:07,029.029 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 13:42:07,029.029 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.18019104003906 2022-03-16 13:42:07,029.029 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98372431882878 2022-03-16 13:42:17,459.459 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01857331395149231 2022-03-16 13:42:17,459.459 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:42:17,459.459 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'are', 'people', 'on', 'ski', '##s', 'in', 'the', 'snow', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:42:17,475.475 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'jacket', 'sky', 'snow', 'pole', 'ski', 'tree', 'mountain', 'glove', 'person', 'woman', 'head', 'coat', 'hand', 'boot', 'cloud', 'ground', 'helmet', 'face', 'hat', 'leg', 'man', 'hair', 'girl', 'skier', 'hill', 'top', 'poles', 'hood', 'snowy', 'slope', 'logo', 'rock', 'arm', 'foot', 'shoe', 'gear', 'strap', 'zipper', 'couple', 'scarf', 'lift', 'sign', 'building', 'board', 'backpack', 'picture', 'glasses', 'front', 'sunglasses'] 2022-03-16 13:42:33,432.432 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'woman', 'person', 'mountain', 'tree', 'sky', 'snow', 'hat', 'pole', 'jacket', 'ski', 'boot', 'helmet', 'glove', 'strap', 'stripe', 'zipper'] 03-16 13:42:39.341 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 13:42:39.341 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 13:42:40.567 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 13:44:56,977.977 2829:trainer.py:487 do_train_dict(): eta: 21:44:50 iter: 19400 speed: 300.6 images/sec total_norm: 135.7278 (136.0436) loss: 152.6704 (152.0363) masked_loss: 1.7860 (1.7910) tag_loss: 151.1162 (150.2453) time: 1.4347 (1.7032) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4297 (1.6981) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:44:57,338.338 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 13:44:57,338.338 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.83840942382812 2022-03-16 13:44:57,338.338 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.96997005756084 2022-03-16 13:45:07,744.744 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01858498528599739 2022-03-16 13:45:07,744.744 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:45:07,745.745 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'inn', '##ards', 'of', 'a', 'wall', 'have', 'been', '[MASK]', 'during', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:45:07,760.760 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'pipe', 'ground', 'shelf', 'bag', 'tool', 'room', 'handle', 'box', '[UNK]', 'paper', 'door', 'cord', 'bucket', 'hose', 'rock', 'board', 'cardboard', 'wire', 'window', 'pole', 'broom', 'drain', 'bathroom', 'hole', 'bottle', 'wheel', 'dirty', 'stain', 'suitcase', 'table', 'container', 'hammer', 'lid', 'cup', 'leg', 'building', 'construction', 'item', 'toilet', 'strap', 'tile', 'knob', 'light', 'towel', 'wood', 'ladder', 'old', 'outlet'] 2022-03-16 13:45:23,712.712 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'building', 'case', 'ground', 'floor', 'construction', 'wall', 'paper', 'window', 'box', 'card', 'bag', 'handle', 'pole', 'tool', 'pipe', 'item', 'shelf', 'drain', 'tile', 'broom'] 2022-03-16 13:47:47,351.351 2829:trainer.py:487 do_train_dict(): eta: 21:42:13 iter: 19500 speed: 300.5 images/sec total_norm: 132.4792 (135.8304) loss: 149.4608 (148.4182) masked_loss: 1.6395 (1.6832) tag_loss: 147.6708 (146.7350) time: 1.4342 (1.7037) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4292 (1.6986) save_time: 9.0279 (30.4322) lr: 0.000071 max mem: 26307 2022-03-16 13:47:47,712.712 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5526315569877625 2022-03-16 13:47:47,713.713 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.90469360351562 2022-03-16 13:47:47,713.713 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97463286652857 2022-03-16 13:47:58,197.197 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018621616065502167 2022-03-16 13:47:58,197.197 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:47:58,198.198 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'haul', 'of', 'bananas', ',', 'bread', ',', 'onions', ',', 'potatoes', ',', 'milk', ',', '[MASK]', 'and', 'more', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:47:58,213.213 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'bag', '[UNK]', 'plastic', 'table', 'fruit', 'label', 'package', 'apple', 'market', 'sign', 'bunch', 'box', 'garlic', 'food', 'vegetable', 'tag', 'onion', 'orange', 'tomato', 'carrot', 'bread', 'bananas', 'floor', 'writing', 'stem', 'potato', 'cookie', 'logo', 'leaf', 'chip', 'basket', 'store', 'wall', 'sale', 'pile', 'produce', 'hand', 'paper', 'different', 'full', 'letter', 'display', 'grocery', 'coconut', 'container', 'light', 'person', 'top', 'sack'] 2022-03-16 13:48:14,092.092 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'table', 'food', 'box', 'bag', 'plastic', 'apple', 'tag', 'milk', 'cream', 'bread', 'logo', 'grocery', 'haul', 'banana', 'strap', 'cookie'] 2022-03-16 13:50:37,774.774 2829:trainer.py:487 do_train_dict(): eta: 21:39:36 iter: 19600 speed: 300.4 images/sec total_norm: 135.4929 (139.4704) loss: 152.5865 (153.6256) masked_loss: 1.7373 (1.7342) tag_loss: 151.3774 (151.8914) time: 1.4339 (1.7043) data: 0.0002 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4289 (1.6993) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:50:38,135.135 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 13:50:38,135.135 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 112.56956481933594 2022-03-16 13:50:38,135.135 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.01120473406641 2022-03-16 13:50:48,696.696 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01861617900431156 2022-03-16 13:50:48,697.697 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:50:48,697.697 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'close', '##up', 'of', 'a', 'pizza', 'with', '[MASK]', 'and', 'sauce', 'on', 'a', 'pan', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:50:48,712.712 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', '[UNK]', 'plate', 'crust', 'cheese', 'handle', 'sink', 'knife', 'pan', 'table', 'stove', 'shadow', 'sauce', 'tray', 'bowl', 'bubble', 'glass', 'top', 'meat', 'cup', 'background', 'bottle', 'oven', 'mushroom', 'spoon', 'light', 'cloth', 'slice', 'white', 'hole', 'metal', 'food', 'dish', 'olive', 'large', 'container', 'napkin', 'base', 'water', 'close', 'ready', 'fork', 'wall', 'stripe', 'surface', 'object', 'small', 'knob', 'cooked', 'screw'] 2022-03-16 13:51:04,702.702 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'table', 'paper', 'background', 'roll', 'handle', 'plate', 'knife', 'pan', 'sink', 'cheese', 'towel', 'pizza', 'sauce', 'spoon', 'stove', 'oven', 'crust', 'bun'] 2022-03-16 13:53:28,224.224 2829:trainer.py:487 do_train_dict(): eta: 21:36:59 iter: 19700 speed: 300.4 images/sec total_norm: 132.4779 (136.2762) loss: 146.1230 (148.4942) masked_loss: 1.6040 (1.6749) tag_loss: 144.3014 (146.8193) time: 1.4332 (1.7045) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4284 (1.6995) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:53:28,587.587 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544 2022-03-16 13:53:28,587.587 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 194.71005249023438 2022-03-16 13:53:28,587.587 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98912100358443 2022-03-16 13:53:39,197.197 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018570857122540474 2022-03-16 13:53:39,197.197 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:53:39,198.198 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bath', 'tub', 'sitting', 'next', 'to', 'a', '[MASK]', '[MASK]', 'in', 'a', 'bathroom', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:53:39,213.213 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bathroom', 'wall', '[UNK]', 'mirror', 'toilet', 'sink', 'tub', 'shower', 'floor', 'towel', 'head', 'handle', 'lid', 'rack', 'ceiling', 'tile', 'light', 'paper', 'soap', 'holder', 'outlet', 'door', 'bar', 'dish', 'drain', 'shelf', 'white', 'glass', 'tank', 'reflection', 'rod', 'box', 'vent', 'large', 'bath', 'knob', 'roll', 'bottle', 'window', 'plate', 'clean', 'cabinet', 'switch', 'next', 'tissue', 'sign', 'small', 'hand', 'ledge', 'bowl'] 2022-03-16 13:53:55,156.156 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'next', 'white', 'door', 'floor', 'wall', 'paper', 'window', 'roll', 'handle', 'mirror', 'bathroom', 'ceiling', 'shower', 'bath', 'sink', 'soap', 'pipe', 'holder', 'towel', 'shelf', 'toilet', 'outlet', 'tile', 'tub', 'rack'] 2022-03-16 13:56:18,888.888 2829:trainer.py:487 do_train_dict(): eta: 21:34:22 iter: 19800 speed: 300.0 images/sec total_norm: 133.4636 (136.8923) loss: 155.7876 (153.8654) masked_loss: 1.6727 (1.7413) tag_loss: 153.7996 (152.1241) time: 1.4343 (1.7066) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.7014) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:56:19,249.249 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 13:56:19,249.249 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.24099731445312 2022-03-16 13:56:19,249.249 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99470502767132 2022-03-16 13:56:29,852.852 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01856466569006443 2022-03-16 13:56:29,852.852 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:56:29,854.854 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'three', 'people', 'standing', 'in', 'the', 'snow', 'and', 'holding', '[MASK]', '[MASK]', 'in', '[MASK]', 'hands', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:56:29,869.869 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glove', 'jacket', 'building', 'ski', 'snow', 'window', '[UNK]', 'man', 'head', 'fence', 'ground', 'coat', 'pole', 'helmet', 'boot', 'person', 'hand', 'tag', 'door', 'hat', 'face', 'railing', 'wall', 'tree', 'woman', 'shoe', 'leg', 'roof', 'next', 'gear', 'red', 'house', 'balcony', 'sign', 'poles', 'pile', 'camera', 'boy', 'handle', 'skier', 'backpack', 'gate', 'foot', 'snowy', 'light', 'flag', 'arm', 'badge', 'front', 'couple'] 2022-03-16 13:56:45,819.819 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'building', 'door', 'woman', 'ground', 'person', 'base', 'window', 'tree', 'snow', 'coat', 'hat', 'tag', 'pole', 'jacket', 'ski', 'fence', 'boot', 'helmet', 'poles', 'shoe', 'glove'] 2022-03-16 13:59:09,346.346 2829:trainer.py:487 do_train_dict(): eta: 21:31:44 iter: 19900 speed: 300.4 images/sec total_norm: 133.3119 (138.6806) loss: 150.1005 (148.5715) masked_loss: 1.6454 (1.6685) tag_loss: 148.6084 (146.9031) time: 1.4334 (1.7046) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.6997) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 13:59:09,707.707 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 13:59:09,707.707 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.2715606689453 2022-03-16 13:59:09,707.707 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.00655279159545 2022-03-16 13:59:20,371.371 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0185573510825634 2022-03-16 13:59:20,371.371 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 13:59:20,372.372 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'gi', '##raf', '##fe', 'bending', 'over', 'and', 'eating', '[MASK]', 'grass', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 13:59:20,387.387 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'head', 'spot', 'neck', 'mane', 'eye', 'tree', 'wall', 'ear', 'horn', 'ground', 'rock', 'mouth', 'zoo', 'trunk', 'grass', 'fence', 'hair', 'nose', 'pole', 'hay', 'plant', 'leg', 'face', 'tongue', 'bush', 'branch', 'enclosure', 'boulder', 'dirt', 'leaf', 'paw', 'next', 'pen', 'rope', 'trough', 'basket', 'stick', 'building', 'food', 'tail', 'container', 'ledge', 'standing', 'shirt', 'post', 'water', 'bird', 'handle', 'mesh'] 2022-03-16 13:59:36,320.320 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'ground', 'rock', 'mouth', 'wall', 'eye', 'neck', 'tree', 'spot', 'leg', 'tongue', 'ear', 'grass', 'pole', 'leaf', 'horn', 'fence', 'zoo', 'cord', 'bending', 'mane'] 2022-03-16 14:02:00,056.056 2829:trainer.py:487 do_train_dict(): eta: 21:29:07 iter: 20000 speed: 299.9 images/sec total_norm: 137.3051 (139.5711) loss: 150.4454 (151.1837) masked_loss: 1.6972 (1.6941) tag_loss: 149.2023 (149.4895) time: 1.4353 (1.7071) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4300 (1.7016) save_time: 9.0279 (30.4322) lr: 0.000070 max mem: 26307 2022-03-16 14:02:00,058.058 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0020000.pt 2022-03-16 14:02:09,169.169 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6315789222717285 2022-03-16 14:02:09,169.169 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 174.6197967529297 2022-03-16 14:02:09,169.169 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.00547940458232 2022-03-16 14:02:19,866.866 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018538890406489372 2022-03-16 14:02:19,866.866 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:02:19,866.866 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bed', 'sitting', 'in', 'a', 'bedroom', 'next', 'to', 'a', 'window', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:02:19,881.881 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['room', 'curtain', 'bed', 'chair', 'lamp', 'wall', 'floor', 'table', 'window', 'pillow', 'blanket', 'shade', 'bedroom', 'base', 'hotel', 'carpet', '[UNK]', 'vent', 'desk', 'sheet', 'ceiling', 'light', 'armchair', 'mirror', 'nightstand', 'television', 'wheel', 'leg', 'picture', 'drawer', 'white', 'phone', 'large', 'cabinet', 'arm', 'paper', 'dresser', 'air', 'red', 'bag', 'back', 'handle', 'outlet', 'book', 'stand', 'door', 'remote', 'cushion', 'telephone', 'furniture'] 2022-03-16 14:02:35,604.604 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'television', 'floor', 'bed', 'table', 'wall', 'chair', 'window', 'bowl', 'desk', 'bedroom', 'handle', 'wheel', 'sheet', 'shade', 'blanket', 'pillow', 'carpet', 'lamp', 'curtain', 'drawer', 'dresser'] 2022-03-16 14:04:58,641.641 2829:trainer.py:487 do_train_dict(): eta: 21:26:48 iter: 20100 speed: 286.7 images/sec total_norm: 136.4110 (141.9151) loss: 149.8997 (151.4312) masked_loss: 1.6591 (1.6626) tag_loss: 148.0607 (149.7686) time: 1.4343 (1.7859) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4294 (1.6934) save_time: 8.8805 (25.0095) lr: 0.000070 max mem: 26307 2022-03-16 14:04:59,002.002 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 14:04:59,003.003 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.5060577392578 2022-03-16 14:04:59,003.003 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99254791335304 2022-03-16 14:05:09,813.813 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01850227825343609 2022-03-16 14:05:09,813.813 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:05:09,814.814 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'family', 'is', 'grouped', 'on', '[MASK]', 'sun', 'porch', '[MASK]', 'a', 'photo', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:05:09,829.829 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'girl', 'tie', 'hand', 'head', 'shirt', 'wall', 'boy', 'jacket', 'child', 'woman', 'ear', 'scarf', '[UNK]', 'door', 'face', 'table', 'eye', 'sweater', 'window', 'group', 'nose', 'person', 'cup', 'picture', 'dress', 'young', 'paper', 'ponytail', 'floor', 'chair', 'jean', 'bag', 'kid', 'man', 'suit', 'little', 'basket', 'necklace', 'container', 'bottle', 'handle', 'stripe', 'bow', 'can', 'plate', 'shoe', 'glasses', 'bracelet', 'smile'] 2022-03-16 14:05:25,815.815 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'family', 'man', 'house', 'hand', 'face', 'door', 'woman', 'cup', 'hair', 'girl', 'child', 'wall', 'boy', 'sun', 'eye', 'window', 'shirt', 'ear', 'tie', 'photo', 'blind', 'jacket', 'porch', 'sweater', 'ponytail', 'scarf'] 2022-03-16 14:07:49,436.436 2829:trainer.py:487 do_train_dict(): eta: 21:24:10 iter: 20200 speed: 299.8 images/sec total_norm: 134.9062 (139.5070) loss: 152.6627 (153.4421) masked_loss: 1.6303 (1.6457) tag_loss: 150.9426 (151.7963) time: 1.4342 (1.7080) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4291 (1.7028) save_time: 8.8805 (25.0095) lr: 0.000070 max mem: 26307 2022-03-16 14:07:49,797.797 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 14:07:49,798.798 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.94622802734375 2022-03-16 14:07:49,798.798 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97797204003545 2022-03-16 14:08:00,580.580 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018487296998500824 2022-03-16 14:08:00,580.580 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:08:00,580.580 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'bananas', 'for', '[MASK]', 'that', 'are', '[MASK]', 'on', 'news', '##print', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:08:00,596.596 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'table', 'newspaper', 'bunch', 'stem', '[UNK]', 'paper', 'fruit', 'person', 'picture', 'magazine', 'plate', 'man', 'pile', 'wall', 'bowl', 'pole', 'bag', 'box', 'cloth', 'basket', 'book', 'shirt', 'sign', 'ripe', 'display', 'spot', 'top', 'woman', 'background', 'bananas', 'writing', 'shelf', 'container', 'group', 'other', 'market', 'head', 'sale', 'photo', 'flower', 'hand', 'floor', 'plastic', 'hair', 'apple', 'orange', 'end', 'many', 'ground'] 2022-03-16 14:08:16,457.457 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'newspaper', 'picture', 'bowl', 'sale', 'plate', 'stem', 'bunch', 'shelf', 'banana'] 2022-03-16 14:10:40,199.199 2829:trainer.py:487 do_train_dict(): eta: 21:21:33 iter: 20300 speed: 299.8 images/sec total_norm: 135.6405 (138.1252) loss: 153.9690 (152.9460) masked_loss: 1.5877 (1.6659) tag_loss: 152.2151 (151.2801) time: 1.4326 (1.7076) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7025) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:10:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 14:10:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.33628845214844 2022-03-16 14:10:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98192755381267 2022-03-16 14:10:51,523.523 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018468158319592476 2022-03-16 14:10:51,523.523 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:10:51,523.523 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'bedroom', 'with', 'a', '[MASK]', 'and', 'desk', 'with', 'chair', 'in', 'it', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:10:51,538.538 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'lamp', 'desk', 'keyboard', 'chair', 'door', 'computer', 'window', 'monitor', 'room', 'shade', 'bed', 'blind', 'rug', 'table', 'bag', 'bedroom', 'blanket', 'mouse', '[UNK]', 'cushion', 'leg', 'handle', 'pillow', 'picture', 'screen', 'laptop', 'drawer', 'outlet', 'office', 'light', 'box', 'knob', 'book', 'basket', 'switch', 'backpack', 'paper', 'shelf', 'phone', 'back', 'vent', 'home', 'frame', 'carpet', 'speaker', 'mat', 'base', 'pad'] 2022-03-16 14:11:07,456.456 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'door', 'light', 'floor', 'bed', 'wall', 'glass', 'chair', 'computer', 'window', 'bag', 'desk', 'bedroom', 'blind', 'remote', 'switch', 'monitor', 'shade', 'blanket', 'keyboard', 'pillow', 'lamp', 'backpack', 'mat', 'rug', 'cushion'] 03-16 14:12:40.618 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 14:12:40.618 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 14:12:41.737 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 14:13:31,043.043 2829:trainer.py:487 do_train_dict(): eta: 21:18:55 iter: 20400 speed: 299.7 images/sec total_norm: 134.2532 (136.4123) loss: 148.6723 (148.6688) masked_loss: 1.7369 (1.6979) tag_loss: 146.8832 (146.9710) time: 1.4342 (1.7084) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.7033) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:13:31,403.403 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 14:13:31,404.404 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.09841918945312 2022-03-16 14:13:31,404.404 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.99751797187619 2022-03-16 14:13:42,350.350 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01845521479845047 2022-03-16 14:13:42,350.350 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:13:42,351.351 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'running', 'down', 'part', 'of', 'a', 'half', 'pipe', 'while', 'holding', 'a', 'skate', '##board', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:13:42,366.366 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'man', '[UNK]', 'ramp', 'pad', 'shoe', 'knee', 'hand', 'short', 'head', 'wheel', 'sock', 'leg', 'elbow', 'skate', 'arm', 'line', 'ground', 'strap', 'shadow', 'glove', 'park', 'board', 'person', 'tree', 'face', 'logo', 'boy', 'skater', 'sky', 'wall', 'sign', 'mountain', 'trick', 'building', 'grass', 'fence', 'hair', 'guy', 'hill', 'slope', 'pole', 'background', 'sleeve', 'bowl', 'snow', 'street', 'light', 'house'] 2022-03-16 14:13:58,344.344 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'part', 'head', 'man', 'hand', 'face', 'half', 'short', 'ground', 'arm', 'shirt', 'snow', 'wheel', 'knee', 'pipe', 'elbow', 'helmet', 'shoe', 'pad', 'ramp', 'strap', 'sock'] 2022-03-16 14:16:21,941.941 2829:trainer.py:487 do_train_dict(): eta: 21:16:18 iter: 20500 speed: 299.6 images/sec total_norm: 133.7125 (136.8156) loss: 148.6973 (151.9906) masked_loss: 1.6160 (1.6008) tag_loss: 147.0124 (150.3898) time: 1.4338 (1.7090) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.7039) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:16:22,302.302 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-16 14:16:22,302.302 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.01275634765625 2022-03-16 14:16:22,302.302 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.98418198742912 2022-03-16 14:16:33,189.189 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018488284200429916 2022-03-16 14:16:33,189.189 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:16:33,189.189 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'black', 'dog', 'standing', 'on', 'top', 'of', 'a', 'tile', 'floor', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:16:33,204.204 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'room', 'floor', 'table', 'chair', '[UNK]', 'window', 'ceiling', 'picture', 'light', 'television', 'door', 'cabinet', 'mirror', 'shelf', 'couch', 'rug', 'pillow', 'fireplace', 'handle', 'kitchen', 'leg', 'cushion', 'glass', 'lamp', 'curtain', 'paper', 'shade', 'mantle', 'coffee', 'book', 'shirt', 'box', 'sofa', 'living', 'drawer', 'stool', 'carpet', 'clock', 'bag', 'microwave', 'hair', 'man', 'flower', 'remote', 'pot', 'switch', 'head', 'hand', 'building'] 2022-03-16 14:16:49,113.113 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'room', 'black', 'top', 'door', 'floor', 'wall', 'chest', 'eye', 'wood', 'ring', 'picture', 'dog', 'leg', 'nose', 'ear', 'cabinet', 'mirror', 'ceiling', 'patch', 'collar', 'cart', 'tile', 'stool', 'refrigerator'] 2022-03-16 14:19:12,748.748 2829:trainer.py:487 do_train_dict(): eta: 21:13:40 iter: 20600 speed: 299.8 images/sec total_norm: 134.9857 (136.9441) loss: 142.5376 (145.9681) masked_loss: 1.7123 (1.7453) tag_loss: 141.2660 (144.2227) time: 1.4327 (1.7081) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7029) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:19:13,111.111 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 14:19:13,111.111 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.69049072265625 2022-03-16 14:19:13,111.111 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97640001601067 2022-03-16 14:19:24,117.117 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018558017909526825 2022-03-16 14:19:24,118.118 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:19:24,118.118 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'bent', '##o', 'boxes', '[MASK]', 'a', 'variety', 'of', 'healthy', 'foods', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:19:24,134.134 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['carrot', 'table', 'cheese', 'food', 'tomato', 'grape', 'meat', '[UNK]', 'container', 'candy', 'bowl', 'star', 'slice', 'fruit', 'lemon', 'box', 'cookie', 'sausage', 'dish', 'plastic', 'vegetable', 'orange', 'mushroom', 'nut', 'lunch', 'tray', 'cake', 'bread', 'onion', 'dessert', 'flower', 'lid', 'potato', 'plate', 'bean', 'fork', 'ball', 'stem', 'face', 'cloth', 'different', 'paper', 'almond', 'handle', 'logo', 'egg', 'sandwich', 'piece', 'banana', 'pea'] 2022-03-16 14:19:40,103.103 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'star', 'table', 'food', 'box', 'variety', 'piece', 'meat', 'healthy', 'cheese', 'candy', 'container', 'slice', 'grape', 'mushroom', 'tomato', 'sausage', 'carrot'] 2022-03-16 14:22:03,630.630 2829:trainer.py:487 do_train_dict(): eta: 21:11:02 iter: 20700 speed: 299.6 images/sec total_norm: 134.2216 (136.0428) loss: 151.2389 (152.7164) masked_loss: 1.5952 (1.6751) tag_loss: 149.5527 (151.0414) time: 1.4329 (1.7088) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.7036) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:22:03,991.991 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 14:22:03,991.991 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.8448486328125 2022-03-16 14:22:03,991.991 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.97490704976596 2022-03-16 14:22:15,018.018 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018558382987976074 2022-03-16 14:22:15,018.018 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:22:15,019.019 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'cake', '[MASK]', 'some', 'candles', 'on', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:22:15,034.034 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['candle', 'cake', 'table', 'wall', '[UNK]', 'birthday', 'holder', 'cloth', 'curtain', 'box', 'room', 'glass', 'chair', 'writing', 'flame', 'paper', 'napkin', 'cardboard', 'plate', 'door', 'window', 'blue', 'hair', 'fork', 'shirt', 'knife', 'person', 'carpet', 'tray', 'sign', 'floor', 'top', 'cookie', 'base', 'book', 'woman', 'handle', 'word', 'stand', 'dress', 'flower', 'display', 'cup', 'ceiling', 'star', 'hand', 'card', 'spoon', 'decoration', 'bottle'] 2022-03-16 14:22:31,011.011 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'door', 'table', 'wall', 'chair', 'paper', 'box', 'sign', 'picture', 'frame', 'handle', 'plate', 'pole', 'cloth', 'flame', 'holder', 'lighter', 'cake', 'curtain', 'necklace', 'outlet', 'candle'] 2022-03-16 14:24:54,607.607 2829:trainer.py:487 do_train_dict(): eta: 21:08:25 iter: 20800 speed: 299.5 images/sec total_norm: 137.4580 (138.7475) loss: 152.2684 (152.1884) masked_loss: 1.7015 (1.6724) tag_loss: 150.7429 (150.5159) time: 1.4337 (1.7098) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.7047) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:24:54,968.968 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 14:24:54,968.968 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 180.63795471191406 2022-03-16 14:24:54,968.968 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.96078127993351 2022-03-16 14:25:06,129.129 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01858743093907833 2022-03-16 14:25:06,129.129 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:25:06,130.130 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'pretty', '[MASK]', 'with', 'some', 'soup', 'in', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:25:06,145.145 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'table', 'plate', 'soup', '[UNK]', 'spoon', 'design', 'food', 'onion', 'glass', 'cup', 'shadow', 'handle', 'vegetable', 'leaf', 'flower', 'napkin', 'white', 'sauce', 'line', 'pepper', 'dish', 'cheese', 'meat', 'pea', 'lemon', 'fork', 'carrot', 'salad', 'fish', 'pasta', 'mushroom', 'shrimp', 'knife', 'reflection', 'potato', 'chicken', 'bread', 'rice', 'cloth', 'herb', 'egg', 'bowls', 'green', 'water', 'top', 'fruit', 'logo', 'cream', 'orange'] 2022-03-16 14:25:22,062.062 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'design', 'table', 'food', 'glass', 'pretty', 'bowl', 'plate', 'flower', 'fork', 'soup', 'pepper', 'napkin'] 2022-03-16 14:27:45,681.681 2829:trainer.py:487 do_train_dict(): eta: 21:05:47 iter: 20900 speed: 299.3 images/sec total_norm: 138.3309 (141.0760) loss: 149.7887 (150.5588) masked_loss: 1.6607 (1.6784) tag_loss: 148.1349 (148.8804) time: 1.4333 (1.7107) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4283 (1.7057) save_time: 8.8805 (25.0095) lr: 0.000069 max mem: 26307 2022-03-16 14:27:46,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 14:27:46,045.045 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.3606719970703 2022-03-16 14:27:46,045.045 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 69.9681798480806 2022-03-16 14:27:57,147.147 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018563518300652504 2022-03-16 14:27:57,147.147 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:27:57,148.148 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'motorcycle', 'rider', 'driving', 'down', '[MASK]', 'referring', 'dirt', 'road', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:27:57,163.163 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'road', 'bush', 'motorcycle', 'helmet', 'forest', 'wood', 'man', 'person', 'jacket', '[UNK]', 'tire', 'path', 'branch', 'grass', 'windshield', 'bike', 'trunk', 'trail', 'wheel', 'ground', 'vehicle', 'mirror', 'dirt', 'leaf', 'car', 'wooded', 'head', 'light', 'truck', 'line', 'track', 'hat', 'hill', 'plant', 'rock', 'side', 'shirt', 'glove', 'group', 'sky', 'area', 'sign', 'brush', 'country', 'bag', 'pole', 'small', 'wall', 'biker'] 2022-03-16 14:28:13,101.101 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'road', 'person', 'forest', 'tree', 'wood', 'trail', 'mirror', 'bush', 'dirt', 'rider', 'flame', 'motorcycle', 'helmet', 'tire', 'glove', 'wooded'] 2022-03-16 14:30:36,960.960 2829:trainer.py:487 do_train_dict(): eta: 21:03:10 iter: 21000 speed: 298.9 images/sec total_norm: 134.4053 (138.1352) loss: 148.8923 (150.7054) masked_loss: 1.5951 (1.6500) tag_loss: 147.4959 (149.0553) time: 1.4347 (1.7128) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4297 (1.7077) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:30:37,320.320 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6129032373428345 2022-03-16 14:30:37,320.320 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.22579956054688 2022-03-16 14:30:37,321.321 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.0021768181245 2022-03-16 14:30:48,433.433 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018592912703752518 2022-03-16 14:30:48,433.433 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:30:48,434.434 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'squat', '##ting', 'outside', 'by', 'some', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:30:48,449.449 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'wall', 'building', '[UNK]', 'flower', 'blind', 'sidewalk', 'brick', 'shoe', 'hand', 'man', 'leg', 'plant', 'head', 'arm', 'ground', 'hair', 'shadow', 'person', 'jacket', 'step', 'ledge', 'line', 'face', 'black', 'curb', 'woman', 'shirt', 'pole', 'block', 'jean', 'bar', 'door', 'white', 'leaf', 'coat', 'sign', 'front', 'hat', 'bag', 'road', 'wheel', 'umbrella', 'handle', 'cap', 'light', 'pot', 'base', 'bicycle', 'street'] 2022-03-16 14:31:04,460.460 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'building', 'woman', 'short', 'girl', 'outside', 'person', 'wall', 'arm', 'plant', 'window', 'step', 'watch', 'box', 'shirt', 'bag', 'camera', 'handle', 'hat', 'blind', 'tag', 'flower', 'hood', 'arrow', 'boot', 'sidewalk', 'backpack', 'suitcase', 'strap', 'luggage'] 2022-03-16 14:33:28,104.104 2829:trainer.py:487 do_train_dict(): eta: 21:00:32 iter: 21100 speed: 299.2 images/sec total_norm: 132.8051 (134.6574) loss: 153.2119 (152.9291) masked_loss: 1.6315 (1.6781) tag_loss: 151.5865 (151.2510) time: 1.4343 (1.7115) data: 0.0001 (0.0005) to_device: 0.0050 (0.0048) time_gpu: 1.4294 (1.7061) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:33:28,465.465 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 14:33:28,465.465 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.84347534179688 2022-03-16 14:33:28,465.465 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.01844263976474 2022-03-16 14:33:39,667.667 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01859108731150627 2022-03-16 14:33:39,668.668 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:33:39,668.668 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', 'riding', 'banking', 'snow', '##board', 'on', 'a', 'mountain', 'slope', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:33:39,683.683 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'jacket', 'man', 'snow', 'ground', 'person', 'glove', 'head', 'arm', 'coat', 'hand', 'helmet', 'board', 'face', 'sky', 'leg', 'hill', 'tree', 'mountain', 'slope', 'boot', 'hood', 'foot', 'hat', 'yellow', 'track', 'skier', 'pole', 'air', 'sleeve', 'design', 'snowy', 'logo', 'cloud', 'hair', 'ski', 'stripe', 'tag', 'line', 'background', 'strap', 'black', 'steep', 'shadow', 'side', 'patch', 'backpack', 'boy', 'scarf', 'shoe'] 2022-03-16 14:33:55,587.587 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'ground', 'board', 'person', 'mountain', 'sky', 'snow', 'coat', 'cloud', 'jacket', 'slope', 'helmet'] 2022-03-16 14:36:19,412.412 2829:trainer.py:487 do_train_dict(): eta: 20:57:54 iter: 21200 speed: 298.9 images/sec total_norm: 133.5796 (136.8119) loss: 149.9103 (151.3754) masked_loss: 1.6800 (1.6548) tag_loss: 148.4138 (149.7206) time: 1.4337 (1.7130) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4285 (1.7079) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:36:19,772.772 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 14:36:19,772.772 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.99519348144531 2022-03-16 14:36:19,773.773 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.02488352099495 2022-03-16 14:36:31,032.032 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018592767417430878 2022-03-16 14:36:31,032.032 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:36:31,032.032 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'pair', 'of', 'head', '##phones', 'in', 'a', 'package', 'on', 'a', 'table', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:36:31,048.048 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'table', 'handle', '[UNK]', 'blade', 'cord', 'light', 'box', 'tape', 'pair', 'desk', 'plastic', 'floor', 'wall', 'wire', 'base', 'chair', 'container', 'case', 'drawer', 'strap', 'blue', 'leg', 'paper', 'top', 'stand', 'display', 'cloth', 'wooden', 'string', 'pen', 'door', 'person', 'book', 'band', 'lid', 'laptop', 'cabinet', 'screen', 'bag', 'computer', 'white', 'tray', 'open', 'screw', 'knife', 'button', 'phone', 'hole', 'vent'] 2022-03-16 14:36:46,982.982 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'person', 'table', 'box', 'jean', 'pair', 'handle', 'blade', 'tape', 'package', 'cord', 'scissors'] 2022-03-16 14:39:10,647.647 2829:trainer.py:487 do_train_dict(): eta: 20:55:17 iter: 21300 speed: 299.0 images/sec total_norm: 136.7047 (139.6794) loss: 150.6798 (150.1393) masked_loss: 1.6671 (1.6495) tag_loss: 149.0127 (148.4899) time: 1.4341 (1.7124) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4289 (1.7073) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:39:11,009.009 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 14:39:11,010.010 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.20590209960938 2022-03-16 14:39:11,010.010 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.03436915673942 2022-03-16 14:39:22,297.297 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01860078237950802 2022-03-16 14:39:22,297.297 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:39:22,298.298 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'bus', 'pulled', 'up', 'to', '[MASK]', 'empty', 'bus', 'stop', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:39:22,313.313 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'bus', 'windshield', 'building', 'sky', 'sign', 'front', 'fence', 'flag', 'road', '[UNK]', 'light', 'street', 'pole', 'number', 'door', 'mirror', 'railing', 'wheel', 'tire', 'roof', 'banner', 'license', 'sidewalk', 'plate', 'line', 'driver', 'advertisement', 'rail', 'car', 'rack', 'letter', 'stripe', 'logo', 'wall', 'white', 'tree', 'post', 'bumper', 'stop', 'ground', 'person', 'man', 'large', 'top', 'city', 'arrow', 'bike', 'floor', 'water'] 2022-03-16 14:39:38,222.222 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'number', 'public', 'water', 'building', 'door', 'road', 'front', 'street', 'ground', 'stop', 'person', 'window', 'sky', 'bus', 'empty', 'roof', 'flag', 'wheel', 'mirror', 'cloud', 'pole', 'arrow', 'fence', 'tent', 'banner', 'trash', 'railing', 'stripe', 'windshield'] 2022-03-16 14:42:01,951.951 2829:trainer.py:487 do_train_dict(): eta: 20:52:39 iter: 21400 speed: 298.9 images/sec total_norm: 134.8360 (137.4399) loss: 149.9980 (152.7794) masked_loss: 1.6302 (1.6743) tag_loss: 148.6584 (151.1051) time: 1.4336 (1.7131) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4287 (1.7079) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:42:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 14:42:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.53526306152344 2022-03-16 14:42:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.03504282042037 2022-03-16 14:42:13,770.770 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018604183569550514 2022-03-16 14:42:13,770.770 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:42:13,770.770 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'plate', 'holding', 'a', 'pizza', 'next', 'to', 'book', 'and', 'glass', 'of', 'wine', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:42:13,786.786 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', 'glass', 'table', 'plate', 'crust', 'wine', 'base', 'fork', 'book', 'slice', '[UNK]', 'stem', 'letter', 'knife', 'napkin', 'handle', 'newspaper', 'cheese', 'bubble', 'paper', 'topping', 'picture', 'reflection', 'pie', 'drink', 'top', 'writing', 'menu', 'food', 'chicken', 'magazine', 'meat', 'next', 'bird', 'ice', 'water', 'olive', 'bottom', 'shadow', 'white', 'foam', 'bottle', 'cup', 'piece', 'red', 'tray', 'bacon', 'person', 'spoon', 'logo'] 2022-03-16 14:42:29,695.695 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['white', 'book', 'table', 'base', 'writing', 'glass', 'newspaper', 'wine', 'plate', 'fork', 'pizza', 'slice'] 03-16 14:42:41.837 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 14:42:41.837 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 14:42:43.089 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 14:44:53,140.140 2829:trainer.py:487 do_train_dict(): eta: 20:50:01 iter: 21500 speed: 299.1 images/sec total_norm: 134.1793 (138.6764) loss: 149.8346 (152.7388) masked_loss: 1.6058 (1.6191) tag_loss: 147.7528 (151.1197) time: 1.4330 (1.7119) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7068) save_time: 8.8805 (25.0095) lr: 0.000068 max mem: 26307 2022-03-16 14:44:53,501.501 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 14:44:53,501.501 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.74705505371094 2022-03-16 14:44:53,502.502 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.04239709288986 2022-03-16 14:45:05,011.011 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018586475402116776 2022-03-16 14:45:05,011.011 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:45:05,012.012 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'bathroom', 'scene', 'looking', 'at', '[MASK]', 'sink', 'and', 'the', 'toilet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:45:05,027.027 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'toilet', 'bathroom', 'lid', 'seat', 'floor', 'handle', '[UNK]', 'door', 'tank', 'sink', 'rack', 'holder', 'paper', 'bar', 'tile', 'towel', 'cabinet', 'base', 'knob', 'bowl', 'box', 'rod', 'can', 'shelf', 'white', 'light', 'outlet', 'drain', 'pipe', 'curtain', 'ceiling', 'mirror', 'small', 'shower', 'roll', 'bottle', 'water', 'trash', 'reflection', 'vent', 'bag', 'cover', 'window', 'soap', 'drawer', 'rug', 'tissue', 'switch', 'frame'] 2022-03-16 14:45:21,006.006 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'door', 'light', 'wall', 'scene', 'tank', 'handle', 'cabinet', 'bathroom', 'sink', 'towel', 'shelf', 'toilet', 'lid', 'rack'] 2022-03-16 14:47:44,505.505 2829:trainer.py:487 do_train_dict(): eta: 20:47:23 iter: 21600 speed: 298.8 images/sec total_norm: 134.3220 (136.0565) loss: 150.1140 (151.6686) masked_loss: 1.5860 (1.6312) tag_loss: 148.1234 (150.0374) time: 1.4330 (1.7136) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.7085) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:47:44,866.866 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 14:47:44,866.866 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.08016967773438 2022-03-16 14:47:44,866.866 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.06031888970581 2022-03-16 14:47:56,399.399 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018601270392537117 2022-03-16 14:47:56,399.399 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:47:56,400.400 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'efficiency', 'apartment', '[MASK]', 'a', 'living', 'room', ',', 'dining', 'room', 'and', 'kitchen', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:47:56,415.415 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'room', 'pillow', 'couch', 'lamp', 'window', 'curtain', 'floor', 'wall', 'shade', 'chair', 'sofa', 'ceiling', 'living', 'coffee', 'leg', 'television', 'cushion', 'carpet', 'picture', '[UNK]', 'door', 'mirror', 'light', 'book', 'end', 'armchair', 'plant', 'vase', 'flower', 'furniture', 'blanket', 'frame', 'drawer', 'vent', 'bowl', 'cabinet', 'rug', 'glass', 'large', 'base', 'desk', 'pot', 'painting', 'top', 'fireplace', 'arm', 'dresser', 'stand', 'shelf'] 2022-03-16 14:48:12,368.368 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'book', 'living', 'floor', 'table', 'wall', 'chair', 'window', 'kitchen', 'picture', 'coffee', 'apartment', 'bowl', 'desk', 'frame', 'plate', 'cabinet', 'ceiling', 'couch', 'flower', 'efficiency', 'shade', 'pillow', 'carpet', 'lamp', 'sofa', 'curtain', 'vase', 'cushion', 'jug'] 2022-03-16 14:50:35,859.859 2829:trainer.py:487 do_train_dict(): eta: 20:44:45 iter: 21700 speed: 298.8 images/sec total_norm: 136.5767 (140.0194) loss: 150.2529 (151.9859) masked_loss: 1.6638 (1.6895) tag_loss: 148.2034 (150.2964) time: 1.4324 (1.7135) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7084) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:50:36,219.219 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 14:50:36,219.219 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.2318878173828 2022-03-16 14:50:36,220.220 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.04952903187603 2022-03-16 14:50:47,674.674 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018620865419507027 2022-03-16 14:50:47,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:50:47,676.676 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'and', 'rosario', '##board', 'in', 'the', 'air', 'in', 'an', '[MASK]', 'area', 'with', 'posts', 'and', 'graffiti', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:50:47,691.691 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'arm', 'man', 'ground', 'head', 'leg', 'hat', 'person', 'hand', 'face', 'sign', 'short', 'board', 'floor', 'foot', 'shoe', 'tree', 'wall', 'light', 'boy', 'wheel', 'jean', 'poster', 'building', 'background', 'shadow', 'air', 'trick', 'picture', 'graffiti', 'cap', 'pad', 'logo', 'woman', 'street', 'ramp', 'knee', 'sky', 'pool', 'band', 'belt', 'window', 'skate', 'pole', 'reflection', 'banner', 'line', 'ball', 'ceiling'] 2022-03-16 14:51:03,611.611 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'area', 'hand', 'face', 'air', 'building', 'ground', 'person', 'wall', 'arm', 'base', 'window', 'sign', 'shirt', 'picture', 'leg', 'bottle', 'ceiling', 'column', 'hat', 'cap', 'indoor', 'pad', 'pillar', 'graffiti'] 2022-03-16 14:53:27,334.334 2829:trainer.py:487 do_train_dict(): eta: 20:42:07 iter: 21800 speed: 298.6 images/sec total_norm: 134.2628 (138.1374) loss: 154.3311 (152.8688) masked_loss: 1.7063 (1.7285) tag_loss: 152.8481 (151.1403) time: 1.4334 (1.7148) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4287 (1.7098) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:53:27,696.696 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 14:53:27,697.697 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.42822265625 2022-03-16 14:53:27,697.697 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.0558759959321 2022-03-16 14:53:39,198.198 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018735645338892937 2022-03-16 14:53:39,199.199 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:53:39,200.200 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'standing', 'on', 'a', 'tennis', 'recipes', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:53:39,215.215 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'shoe', 'fence', 'shirt', 'sock', 'short', 'court', '[UNK]', 'pole', 'tennis', 'leg', 'man', 'tree', 'hand', 'post', 'arm', 'ball', 'head', 'person', 'grass', 'ground', 'hair', 'bush', 'knee', 'boy', 'player', 'top', 'handle', 'air', 'woman', 'face', 'hat', 'leaf', 'dirt', 'sky', 'foot', 'stripe', 'cap', 'bat', 'game', 'young', 'logo', 'roof', 'bench', 'plant', 'tank', 'net', 'match', 'swing', 'wall'] 2022-03-16 14:53:55,095.095 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'line', 'court', 'short', 'hair', 'post', 'person', 'arm', 'foot', 'tree', 'shirt', 'leg', 'tennis', 'bush', 'pole', 'fence', 'shoe', 'sock'] 2022-03-16 14:56:18,829.829 2829:trainer.py:487 do_train_dict(): eta: 20:39:29 iter: 21900 speed: 298.6 images/sec total_norm: 135.4611 (138.8950) loss: 152.2609 (153.3947) masked_loss: 1.6345 (1.6399) tag_loss: 150.8832 (151.7547) time: 1.4332 (1.7149) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7097) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:56:19,188.188 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 14:56:19,189.189 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.06427001953125 2022-03-16 14:56:19,189.189 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.10279261849143 2022-03-16 14:56:30,828.828 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018765972927212715 2022-03-16 14:56:30,828.828 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:56:30,828.828 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'baseball', 'player', 'holding', 'a', 'bat', 'while', 'standing', '[MASK]', 'a', 'field', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:56:30,844.844 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'helmet', 'field', 'line', 'catcher', 'man', 'shirt', 'grass', 'glove', 'bat', 'wall', 'dirt', 'uniform', 'batter', 'umpire', 'shoe', 'player', 'baseball', 'mask', 'leg', 'shadow', 'plate', 'stand', 'home', 'person', 'head', 'fence', 'game', 'hat', 'ground', 'guard', 'belt', 'ball', 'sign', 'jersey', 'hand', 'advertisement', 'banner', 'shin', 'crowd', 'logo', 'spectator', 'railing', 'ready', 'camera', 'pitch', 'cap', 'number', 'cooler', 'base'] 2022-03-16 14:56:46,747.747 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'home', 'line', 'player', 'field', 'person', 'wall', 'base', 'stand', 'baseball', 'shirt', 'leg', 'crowd', 'plate', 'shadow', 'grass', 'belt', 'hat', 'uniform', 'dirt', 'bat', 'mask', 'helmet', 'shoe', 'catcher', 'glove', 'umpire', 'batter'] 2022-03-16 14:59:10,228.228 2829:trainer.py:487 do_train_dict(): eta: 20:36:51 iter: 22000 speed: 298.7 images/sec total_norm: 135.7891 (138.4069) loss: 153.4985 (153.1872) masked_loss: 1.6509 (1.6695) tag_loss: 151.6041 (151.5176) time: 1.4330 (1.7140) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7089) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 14:59:10,590.590 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 14:59:10,590.590 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.63137817382812 2022-03-16 14:59:10,590.590 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
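A quick consistency check on the do_train_dict() lines above: the smoothed speed multiplied by the smoothed per-iteration time comes out at ~512 images at every logging interval shown, i.e. the run's global batch size is about 512. This is an inference from the logged numbers, not something the log states; the sketch below just re-derives it from three (speed, time) pairs copied from this section.

    # Inference from the logged numbers (not stated in the log): "speed"
    # (images/sec) times the smoothed per-iteration time is ~512 at every
    # logging interval, i.e. a global batch of roughly 512 images.
    pairs = [(298.8, 1.7136),   # iter 21600
             (298.7, 1.7140),   # iter 22000
             (297.5, 1.7211)]   # iter 22700
    for speed, avg_time in pairs:
        print(round(speed * avg_time))  # prints 512 for each pair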
= 70.11265986861147 2022-03-16 14:59:22,343.343 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018753327429294586 2022-03-16 14:59:22,343.343 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 14:59:22,343.343 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'number', 'of', 'people', '[MASK]', 'a', 'beach', 'with', 'many', '[MASK]', '##s', 'flying', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 14:59:22,359.359 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kite', 'sky', 'person', 'sand', 'mountain', 'beach', 'cloud', 'short', 'string', 'hill', 'man', 'shirt', 'water', 'woman', '[UNK]', 'air', 'hat', 'bag', 'building', 'house', 'tail', 'tent', 'head', 'couple', 'wave', 'chair', 'leg', 'hair', 'tree', 'group', 'parachute', 'umbrella', 'rope', 'logo', 'shadow', 'child', 'top', 'day', 'sandy', 'boy', 'foot', 'balloon', 'footprint', 'blanket', 'grass', 'boat', 'flag', 'bikini', 'ground', 'ocean'] 2022-03-16 14:59:38,391.391 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'man', 'number', 'air', 'short', 'ground', 'person', 'hill', 'mountain', 'beach', 'sky', 'shirt', 'string', 'sand', 'cloud', 'kite'] 2022-03-16 15:02:01,945.945 2829:trainer.py:487 do_train_dict(): eta: 20:34:13 iter: 22100 speed: 298.2 images/sec total_norm: 134.0390 (137.1120) loss: 150.2495 (151.1519) masked_loss: 1.6509 (1.6876) tag_loss: 148.2164 (149.4642) time: 1.4338 (1.7172) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.7121) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 15:02:02,305.305 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 15:02:02,306.306 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.83810424804688 2022-03-16 15:02:02,306.306 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.12876385181873 2022-03-16 15:02:13,944.944 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018755590543150902 2022-03-16 15:02:13,945.945 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:02:13,945.945 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'is', 'interested', 'in', 'what', 'is', '[MASK]', 'on', 'his', 'phone', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:02:13,960.960 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'sky', 'hand', 'tree', 'railing', 'hair', 'head', 'phone', 'ear', 'water', 'short', 'grass', 'fence', '[UNK]', 'foil', 'eye', 'face', 'floor', 'nose', 'cell', 'rail', 'person', 'collar', 'mouth', 'pole', 'balcony', 'post', 'reflection', 'building', 'arm', 'camera', 'table', 'bush', 'park', 'leg', 'deck', 'handle', 'chair', 'finger', 'bench', 'porch', 'bottle', 'next', 'window', 'leaf', 'sidewalk', 'white', 'metal', 'top'] 2022-03-16 15:02:29,833.833 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'water', 'short', 'hair', 'floor', 'phone', 'tree', 'sky', 'shirt', 'ear', 'interested', 'porch', 'fence', 'railing', 'foil'] 2022-03-16 15:04:53,700.700 2829:trainer.py:487 do_train_dict(): eta: 20:31:35 iter: 22200 speed: 298.1 images/sec total_norm: 135.2420 (137.9736) loss: 148.1449 (148.5095) masked_loss: 1.6983 (1.6648) tag_loss: 145.9914 (146.8447) time: 1.4336 (1.7176) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.7121) save_time: 8.8805 (25.0095) lr: 0.000067 max mem: 26307 2022-03-16 15:04:54,061.061 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 15:04:54,062.062 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.98159790039062 2022-03-16 15:04:54,062.062 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.14292232017345 2022-03-16 15:05:05,908.908 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018771588802337646 2022-03-16 15:05:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:05:05,909.909 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'pup', '##pies', 'playing', 'in', 'the', 'green', 'grass', 'of', 'their', 'yard', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:05:05,924.924 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'dog', 'collar', 'leg', 'tail', 'head', 'ear', '[UNK]', 'neck', 'paw', 'mouth', 'field', 'face', 'foot', 'nose', 'eye', 'spot', 'fence', 'ground', 'person', 'grassy', 'back', 'air', 'white', 'tag', 'green', 'wall', 'black', 'bush', 'fur', 'leaf', 'flower', 'legs', 'pole', 'tree', 'group', 'trunk', 'small', 'body', 'patch', 'dirt', 'park', 'top', 'chest', 'other', 'house', 'yard', 'hair', 'post', 'toy'] 2022-03-16 15:05:21,862.862 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'green', 'mouth', 'neck', 'dog', 'leg', 'yard', 'ear', 'grass', 'tail', 'collar', 'paw'] 2022-03-16 15:07:45,421.421 2829:trainer.py:487 do_train_dict(): eta: 20:28:57 iter: 22300 speed: 298.2 images/sec total_norm: 137.2633 (138.9839) loss: 149.4451 (149.7311) masked_loss: 1.5724 (1.6184) tag_loss: 147.5350 (148.1127) time: 1.4330 (1.7172) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4279 (1.7122) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:07:45,782.782 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44117647409439087 2022-03-16 15:07:45,782.782 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.8798828125 2022-03-16 15:07:45,782.782 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.1530785730907 2022-03-16 15:07:57,543.543 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018815239891409874 2022-03-16 15:07:57,543.543 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:07:57,544.544 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'hitter', 'prepares', 'to', 'get', '[MASK]', 'the', 'batter', '##s', 'box', 'for', 'the', 'pitch', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:07:57,559.559 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'stand', 'shirt', 'man', '[UNK]', 'helmet', 'seat', 'shoe', 'woman', 'game', 'hat', 'person', 'player', 'stadium', 'field', 'number', 'uniform', 'catcher', 'baseball', 'line', 'bat', 'glove', 'umpire', 'jersey', 'cap', 'stair', 'chair', 'mask', 'grass', 'dirt', 'spectator', 'fence', 'head', 'batter', 'hair', 'hand', 'sunglasses', 'boy', 'bag', 'glasses', 'leg', 'barrier', 'ground', 'railing', 'step', 'ball', 'logo', 'belt', 'plate', 'base'] 2022-03-16 15:08:13,476.476 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'number', 'game', 'line', 'player', 'woman', 'field', 'person', 'child', 'wall', 'seat', 'stand', 'chair', 'stadium', 'baseball', 'ball', 'shirt', 'jersey', 'bag', 'hat', 'uniform', 'pitch', 'bat', 'mask', 'glasses', 'helmet', 'shoe', 'catcher', 'glove', 'hitter', 'umpire', 'spectator', 'stair'] 2022-03-16 15:10:37,248.248 2829:trainer.py:487 do_train_dict(): eta: 20:26:19 iter: 22400 speed: 298.0 images/sec total_norm: 135.1457 (138.7956) loss: 151.7771 (151.7263) masked_loss: 1.5879 (1.6285) tag_loss: 150.1920 (150.0979) time: 1.4338 (1.7183) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4286 (1.7132) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:10:37,608.608 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 15:10:37,608.608 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.73355102539062 2022-03-16 15:10:37,609.609 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16920547485351 2022-03-16 15:10:49,522.522 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01884065568447113 2022-03-16 15:10:49,522.522 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:10:49,522.522 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'astronomer', 'baseball', 'player', 'is', 'successful', 'in', 'his', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:10:49,538.538 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'sky', 'shoe', 'man', 'bat', 'dirt', 'pole', 'baseball', 'helmet', 'uniform', 'fence', 'player', 'belt', 'plate', 'person', 'leg', 'grass', 'field', 'arm', 'ground', 'batter', 'home', 'hand', 'bridge', 'head', 'building', 'line', 'glove', 'base', 'game', 'light', 'tree', 'ball', 'catcher', 'jersey', 'sign', 'umpire', 'cloud', 'boy', 'hat', 'foot', 'roof', 'short', 'ready', 'tower', 'swing', 'bench', 'mask', 'number'] 2022-03-16 15:11:05,559.559 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'home', 'hand', 'player', 'field', 'ground', 'person', 'bridge', 'successful', 'stadium', 'tree', 'attempt', 'baseball', 'ball', 'sign', 'sky', 'shirt', 'plate', 'grass', 'belt', 'uniform', 'pole', 'dirt', 'bat', 'wire', 'logo', 'fence', 'helmet', 'shoe'] 03-16 15:12:43.178 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 15:12:43.178 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 15:12:44.354 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 96}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 15:13:28,953.953 2829:trainer.py:487 do_train_dict(): eta: 20:23:41 iter: 22500 speed: 298.2 images/sec total_norm: 137.7139 (139.5286) loss: 151.3708 (150.3748) masked_loss: 1.6112 (1.6567) tag_loss: 149.7876 (148.7181) time: 1.4333 (1.7171) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4281 (1.7120) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:13:29,315.315 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-16 15:13:29,315.315 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.44058227539062 2022-03-16 15:13:29,315.315 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
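The aml_server.py monitor() records interleaved with the training output print a Python-literal list with one dict per GPU. A minimal sketch for condensing such a snapshot into summary numbers, assuming only the three keys the log actually shows; the function name summarize and the truncated two-GPU payload are illustrative:

    import ast

    def summarize(payload: str) -> str:
        """Condense one monitor() snapshot (a list of per-GPU dicts)."""
        gpus = ast.literal_eval(payload)  # the log prints a Python literal, not JSON
        avg_util = sum(g['gpu_util'] for g in gpus) / len(gpus)
        peak_mem = max(g['mem_used'] / g['mem_total'] for g in gpus)
        return f"{len(gpus)} GPUs, avg util {avg_util:.0f}%, peak mem {peak_mem:.0%}"

    snap = ("[{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, "
            "{'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]")
    print(summarize(snap))  # -> 2 GPUs, avg util 98%, peak mem 89%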
= 70.17193963042402 2022-03-16 15:13:41,138.138 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01885174959897995 2022-03-16 15:13:41,139.139 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:13:41,139.139 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'plate', '[MASK]', 'rolls', 'and', 'some', '[MASK]', '##tu', '##ce', 'with', 'a', '[MASK]', 'on', 'the', 'side', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:13:41,154.154 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', '[UNK]', 'bread', 'sandwich', 'plate', 'salad', 'glass', 'food', 'bowl', 'cup', 'tray', 'fork', 'napkin', 'person', 'handle', 'container', 'knife', 'bottle', 'leaf', 'wall', 'chair', 'water', 'bun', 'meat', 'tomato', 'floor', 'basket', 'spoon', 'liquid', 'lid', 'carrot', 'hand', 'label', 'window', 'top', 'paper', 'lemon', 'ground', 'green', 'onion', 'shirt', 'shadow', 'wine', 'pepper', 'drink', 'flower', 'tile', 'light', 'logo', 'arm'] 2022-03-16 15:13:57,079.079 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'side', 'top', 'table', 'food', 'chair', 'drink', 'handle', 'plate', 'bottle', 'liquid', 'bread', 'fork', 'sandwich', 'container', 'lid', 'jar', 'salad', 'napkin'] 2022-03-16 15:16:20,846.846 2829:trainer.py:487 do_train_dict(): eta: 20:21:03 iter: 22600 speed: 297.9 images/sec total_norm: 139.6925 (141.5385) loss: 151.0261 (151.1999) masked_loss: 1.7646 (1.7453) tag_loss: 149.5672 (149.4547) time: 1.4341 (1.7189) data: 0.0001 (0.0002) to_device: 0.0049 (0.0047) time_gpu: 1.4289 (1.7140) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:16:21,206.206 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 15:16:21,207.207 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.1366424560547 2022-03-16 15:16:21,207.207 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16016181239998 2022-03-16 15:16:33,063.063 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01883060298860073 2022-03-16 15:16:33,063.063 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:16:33,064.064 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'couple', 'of', 'kids', '[MASK]', 'are', 'in', 'some', 'bags', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:16:33,079.079 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['girl', '[UNK]', 'shirt', 'wall', 'hair', 'hand', 'head', 'arm', 'leg', 'eye', 'nose', 'face', 'floor', 'child', 'shoe', 'short', 'fireplace', 'bed', 'food', 'foot', 'sock', 'ear', 'young', 'chair', 'book', 'stripe', 'table', 'pizza', 'boy', 'mouth', 'ponytail', 'plate', 'bag', 'pillow', 'blanket', 'room', 'woman', 'couch', 'box', 'strap', 'carpet', 'hat', 'little', 'picture', 'top', 'fire', 'knee', 'person', 'paper', 'toy'] 2022-03-16 15:16:48,919.919 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'woman', 'hair', 'girl', 'person', 'child', 'bed', 'wall', 'arm', 'couple', 'baby', 'shirt', 'bag', 'ear', 'tag', 'pizza', 'suitcase'] 2022-03-16 15:19:12,955.955 2829:trainer.py:487 do_train_dict(): eta: 20:18:25 iter: 22700 speed: 297.5 images/sec total_norm: 134.8037 (137.6708) loss: 151.8196 (153.4212) masked_loss: 1.5916 (1.6368) tag_loss: 150.1186 (151.7844) time: 1.4348 (1.7211) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4296 (1.7159) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:19:13,317.317 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4117647111415863 2022-03-16 15:19:13,318.318 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.55197143554688 2022-03-16 15:19:13,318.318 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16097405082301 2022-03-16 15:19:25,344.344 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0188747625797987 2022-03-16 15:19:25,344.344 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:19:25,345.345 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'a', 'dirt', 'bike', 'on', 'a', '[MASK]', 'trail', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:19:25,360.360 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'dirt', 'helmet', 'man', 'boot', 'tire', 'bike', 'ground', 'sky', 'person', '[UNK]', 'wheel', 'grass', 'shirt', 'head', 'pole', 'hand', 'cone', 'field', 'leg', 'jacket', 'track', 'vest', 'glove', 'fence', 'rider', 'flag', 'mud', 'fender', 'sign', 'rope', 'arm', 'foot', 'uniform', 'face', 'number', 'tree', 'road', 'banner', 'race', 'post', 'barrier', 'spoke', 'background', 'cloud', 'outfit', 'hill', 'course', 'building', 'shadow'] 2022-03-16 15:19:41,194.194 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'ground', 'track', 'person', 'arm', 'sky', 'shirt', 'leg', 'trail', 'chain', 'wheel', 'grass', 'stick', 'pole', 'jacket', 'dirt', 'glasses', 'bike', 'mud', 'fence', 'barrier', 'motorcycle', 'boot', 'helmet', 'tire', 'muddy', 'glove', 'vest'] 2022-03-16 15:22:04,842.842 2829:trainer.py:487 do_train_dict(): eta: 20:15:46 iter: 22800 speed: 297.9 images/sec total_norm: 135.3317 (139.6346) loss: 149.1247 (151.6351) masked_loss: 1.6151 (1.6486) tag_loss: 147.5096 (149.9866) time: 1.4324 (1.7189) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4271 (1.7137) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:22:05,203.203 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 15:22:05,203.203 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.89584350585938 2022-03-16 15:22:05,203.203 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16715342092722 2022-03-16 15:22:17,211.211 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018884718418121338 2022-03-16 15:22:17,211.211 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:22:17,211.211 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'engulfed', 'pine', '##apple', 'besides', 'a', 'plate', '[MASK]', 'orange', '##s', 'and', 'a', 'small', 'bowl', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:22:17,227.227 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['orange', 'bowl', '[UNK]', 'wall', 'table', 'plant', 'cup', 'fruit', 'leaf', 'stem', 'plate', 'pot', 'container', 'vase', 'red', 'top', 'ground', 'bucket', 'cloth', 'flower', 'apple', 'basket', 'tray', 'straw', 'door', 'bunch', 'mat', 'display', 'bowls', 'handle', 'lid', 'brick', 'paint', 'floor', 'glass', 'design', 'next', 'base', 'candle', 'banana', 'paper', 'small', 'mug', 'lemon', 'carpet', 'sign', 'jar', 'fresh', 'stick', 'stand'] 2022-03-16 15:22:33,177.177 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'cup', 'ground', 'table', 'wall', 'plant', 'orange', 'bowl', 'plate', 'fruit', 'leaf', 'stem', 'pot', 'container', 'straw', 'bucket'] 2022-03-16 15:24:56,941.941 2829:trainer.py:487 do_train_dict(): eta: 20:13:08 iter: 22900 speed: 297.5 images/sec total_norm: 134.7267 (137.5499) loss: 150.8632 (150.9382) masked_loss: 1.6942 (1.6851) tag_loss: 149.1250 (149.2531) time: 1.4336 (1.7210) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4287 (1.7159) save_time: 8.8805 (25.0095) lr: 0.000066 max mem: 26307 2022-03-16 15:24:57,302.302 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 15:24:57,303.303 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 175.9125213623047 2022-03-16 15:24:57,303.303 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.15778335903002 2022-03-16 15:25:09,416.416 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018883822485804558 2022-03-16 15:25:09,416.416 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:25:09,417.417 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'is', 'sitting', 'on', 'top', '[MASK]', 'the', 'refrigerator', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:25:09,432.432 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['refrigerator', 'door', 'window', 'wall', 'handle', '[UNK]', 'pen', 'head', 'cap', 'container', 'ear', 'magnet', 'kitchen', 'paper', 'nose', 'table', 'bottle', 'picture', 'eye', 'face', 'grass', 'curtain', 'cat', 'cabinet', 'train', 'box', 'tree', 'ceiling', 'floor', 'chair', 'mouth', 'frame', 'light', 'shirt', 'hair', 'wire', 'person', 'top', 'pencil', 'lid', 'shelf', 'bag', 'towel', 'cup', 'car', 'microwave', 'woman', 'hand', 'reflection', 'cord'] 2022-03-16 15:25:25,413.413 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'room', 'top', 'book', 'door', 'light', 'wall', 'eye', 'paper', 'window', 'box', 'kitchen', 'picture', 'leg', 'ear', 'cat', 'mirror', 'grass', 'ceiling', 'cap', 'basket', 'curtain', 'container', 'lid', 'magnet', 'refrigerator', 'paw'] 2022-03-16 15:27:48,964.964 2829:trainer.py:487 do_train_dict(): eta: 20:10:30 iter: 23000 speed: 297.6 images/sec total_norm: 136.8162 (142.2599) loss: 148.6096 (147.6655) masked_loss: 1.6150 (1.6340) tag_loss: 146.6157 (146.0314) time: 1.4332 (1.7203) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4280 (1.7151) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:27:49,325.325 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 15:27:49,326.326 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.98287963867188 2022-03-16 15:27:49,326.326 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.14104453413002 2022-03-16 15:28:01,390.390 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01887005940079689 2022-03-16 15:28:01,390.390 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:28:01,391.391 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'kid', '[MASK]', 'laying', 'down', 'and', 'reading', 'a', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:28:01,406.406 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'arm', 'bed', 'book', 'hair', 'head', 'face', 'pillow', 'girl', 'nose', 'blanket', 'eye', 'leg', 'toy', 'shirt', 'bear', '[UNK]', 'boy', 'person', 'ear', 'wall', 'animal', 'woman', 'mouth', 'finger', 'teddy', 'stuffed', 'foot', 'short', 'paw', 'box', 'picture', 'child', 'doll', 'dog', 'young', 'sheet', 'glasses', 'design', 'apple', 'bag', 'bird', 'logo', 'writing', 'dot', 'laptop', 'flower', 'ball', 'floor', 'table'] 2022-03-16 15:28:17,280.280 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'book', 'young', 'hair', 'girl', 'person', 'bed', 'arm', 'shirt', 'animal', 'finger', 'ear', 'kid', 'logo', 'blanket', 'toy', 'pillow', 'curtain', 'stuffed'] 2022-03-16 15:30:41,093.093 2829:trainer.py:487 do_train_dict(): eta: 20:07:52 iter: 23100 speed: 297.5 images/sec total_norm: 137.7201 (138.7487) loss: 153.3822 (150.9167) masked_loss: 1.6256 (1.6272) tag_loss: 151.5968 (149.2896) time: 1.4327 (1.7212) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7160) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:30:41,454.454 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 15:30:41,454.454 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.96005249023438 2022-03-16 15:30:41,454.454 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16326327159487 2022-03-16 15:30:53,664.664 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018871325999498367 2022-03-16 15:30:53,664.664 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:30:53,665.665 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'herd', 'of', 'sheep', 'near', 'a', 'barn', '[MASK]', 'a', 'mountain', '.', 'kills', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:30:53,680.680 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'grass', 'barn', 'fence', 'hill', 'sheep', 'field', 'cloud', 'mountain', 'roof', 'building', 'animal', 'herd', 'farm', '[UNK]', 'dirt', 'post', 'house', 'pasture', 'horse', 'grazing', 'cow', 'background', 'rock', 'person', 'grassy', 'head', 'pole', 'green', 'lush', 'ground', 'bush', 'hillside', 'large', 'lamb', 'area', 'open', 'group', 'gate', 'shed', 'dog', 'car', 'door', 'leaf', 'flower', 'shadow', 'goat', 'hay', 'snow'] 2022-03-16 15:31:09,591.591 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'field', 'hill', 'mountain', 'tree', 'sky', 'animal', 'roof', 'grass', 'cloud', 'sheep', 'fence', 'barn', 'herd'] 2022-03-16 15:33:33,041.041 2829:trainer.py:487 do_train_dict(): eta: 20:05:13 iter: 23200 speed: 297.8 images/sec total_norm: 135.5694 (138.0753) loss: 148.0987 (150.2050) masked_loss: 1.6260 (1.6843) tag_loss: 146.9210 (148.5208) time: 1.4329 (1.7195) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7144) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:33:33,402.402 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 15:33:33,403.403 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.2624053955078 2022-03-16 15:33:33,403.403 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16169995811364 2022-03-16 15:33:45,711.711 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018899137154221535 2022-03-16 15:33:45,711.711 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:33:45,712.712 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'cattle', 'being', 'herd', '##ed', 'down', 'a', 'trail', 'with', '[MASK]', '[MASK]', 'the', 'distance', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:33:45,727.727 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'field', 'grass', 'sky', 'cow', 'water', 'mountain', 'post', 'fence', 'cloud', 'hill', 'river', 'pasture', 'pole', 'head', 'herd', 'leg', 'dirt', '[UNK]', 'bush', 'pond', 'animal', 'smoke', 'road', 'tail', 'group', 'cattle', 'grassy', 'horse', 'path', 'fog', 'distance', 'stream', 'area', 'forest', 'ear', 'lush', 'background', 'rock', 'large', 'person', 'building', 'top', 'ground', 'open', 'next', 'brown', 'shadow', 'green', 'dog'] 2022-03-16 15:34:01,721.721 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'river', 'field', 'post', 'hill', 'mountain', 'distance', 'tree', 'sky', 'leg', 'trail', 'grass', 'tail', 'cloud', 'dirt', 'fence', 'pond', 'cow', 'pasture'] 2022-03-16 15:36:25,157.157 2829:trainer.py:487 do_train_dict(): eta: 20:02:35 iter: 23300 speed: 297.5 images/sec total_norm: 138.9259 (140.1018) loss: 149.3187 (149.4137) masked_loss: 1.6085 (1.6140) tag_loss: 147.7715 (147.7998) time: 1.4331 (1.7211) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7157) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:36:25,520.520 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 15:36:25,520.520 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.348876953125 2022-03-16 15:36:25,520.520 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.17541662036864 2022-03-16 15:36:37,742.742 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018946433439850807 2022-03-16 15:36:37,742.742 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:36:37,742.742 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'view', 'of', 'a', 'dining', 'room', 'with', 'a', 'chandler', 'above', 'the', 'table', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:36:37,758.758 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['chair', 'wall', 'table', 'window', '[UNK]', 'light', 'floor', 'door', 'picture', 'room', 'ceiling', 'curtain', 'glass', 'plant', 'rug', 'bottle', 'vase', 'kitchen', 'dining', 'paper', 'book', 'towel', 'shelf', 'cloth', 'plate', 'candle', 'pot', 'flower', 'napkin', 'stool', 'cabinet', 'blind', 'cushion', 'tile', 'bar', 'basket', 'lamp', 'tray', 'cup', 'coffee', 'area', 'holder', 'switch', 'phone', 'bowl', 'counter', 'mat', 'couch', 'clock', 'fixture'] 2022-03-16 15:36:53,673.673 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'room', 'door', 'light', 'floor', 'table', 'wall', 'view', 'glass', 'chair', 'plant', 'window', 'picture', 'wine', 'fan', 'bottle', 'ceiling', 'flower', 'dining', 'lamp', 'curtain', 'shelf', 'outlet', 'chandler', 'jar', 'vase', 'rug'] 2022-03-16 15:39:17,450.450 2829:trainer.py:487 do_train_dict(): eta: 19:59:56 iter: 23400 speed: 297.2 images/sec total_norm: 137.9162 (140.4553) loss: 149.7691 (150.4321) masked_loss: 1.5975 (1.5901) tag_loss: 148.1763 (148.8421) time: 1.4351 (1.7230) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4299 (1.7177) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:39:17,810.810 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 15:39:17,811.811 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.7098388671875 2022-03-16 15:39:17,811.811 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.16895571769552 2022-03-16 15:39:30,042.042 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018980173394083977 2022-03-16 15:39:30,043.043 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:39:30,043.043 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'display', 'in', 'a', '[MASK]', 'filled', 'with', 'lots', 'of', 'fresh', 'produce', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:39:30,059.059 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'woman', 'ceiling', 'person', 'wall', '[UNK]', 'chair', 'food', 'glasses', 'short', 'light', 'fruit', 'man', 'refrigerator', 'banana', 'shoe', 'hair', 'floor', 'shelf', 'building', 'table', 'top', 'picture', 'window', 'stand', 'bowl', 'ground', 'case', 'sign', 'bottle', 'bag', 'box', 'cooler', 'tank', 'jean', 'display', 'grape', 'poster', 'door', 'cart', 'tray', 'fan', 'stool', 'hat', 'pastry', 'glass', 'shop', 'apple', 'container', 'lady'] 2022-03-16 15:39:46,100.100 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'top', 'light', 'woman', 'short', 'case', 'ground', 'hair', 'girl', 'person', 'floor', 'table', 'wall', 'food', 'chair', 'bar', 'box', 'sign', 'shirt', 'picture', 'produce', 'scale', 'drink', 'bowl', 'display', 'restaurant', 'fresh', 'tank', 'bottle', 'ceiling', 'fruit', 'hat', 'cap', 'pole', 'glasses', 'rod', 'basket', 'lid', 'poster', 'banana', 'refrigerator'] 2022-03-16 15:42:09,707.707 2829:trainer.py:487 do_train_dict(): eta: 19:57:18 iter: 23500 speed: 297.2 images/sec total_norm: 141.4391 (143.7444) loss: 150.1758 (150.0126) masked_loss: 1.6447 (1.6607) tag_loss: 148.4586 (148.3519) time: 1.4334 (1.7225) data: 0.0002 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4283 (1.7176) save_time: 8.8805 (25.0095) lr: 0.000065 max mem: 26307 2022-03-16 15:42:10,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 15:42:10,070.070 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.4872283935547 2022-03-16 15:42:10,070.070 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.18996980634786 2022-03-16 15:42:22,513.513 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018972747027873993 2022-03-16 15:42:22,513.513 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:42:22,513.513 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'sheep', 'grazing', 'on', 'a', 'lush', 'green', 'hillside', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:42:22,528.528 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'leg', 'sheep', 'grass', 'head', 'tree', 'nose', 'field', 'eye', 'sky', 'road', 'face', '[UNK]', 'shadow', 'fence', 'wool', 'pole', 'background', 'lamb', 'mouth', 'building', 'hill', 'cloud', 'house', 'green', 'grassy', 'foot', 'white', 'mountain', 'car', 'standing', 'path', 'bridge', 'dirt', 'sign', 'bush', 'body', 'tail', 'next', 'line', 'baby', 'paint', 'tag', 'front', 'ground', 'roof', 'animal', 'camera', 'window', 'other'] 2022-03-16 15:42:38,475.475 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'road', 'field', 'green', 'mouth', 'hill', 'eye', 'tree', 'sky', 'leg', 'nose', 'ear', 'grass', 'tail', 'sheep', 'fence', 'wool', 'lamb', 'herd', 'grazing', 'lush', 'hillside'] 03-16 15:42:44.453 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 15:42:44.453 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 15:42:45.636 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}] 2022-03-16 15:45:02,088.088 2829:trainer.py:487 do_train_dict(): eta: 19:54:40 iter: 23600 speed: 297.0 images/sec total_norm: 141.3566 (144.9451) loss: 147.9414 (147.7448) masked_loss: 1.5325 (1.5703) tag_loss: 146.0698 (146.1745) time: 1.4329 (1.7239) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4278 (1.7189) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:45:02,447.447 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 15:45:02,448.448 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.03775024414062 2022-03-16 15:45:02,448.448 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.20385904754768 2022-03-16 15:45:14,942.942 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01897869072854519 2022-03-16 15:45:14,942.942 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:45:14,942.942 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'motor', '##cy', '##cl', '##ist', 'is', 'happy', 'composite', 'be', 'on', 'the', 'road', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:45:14,958.958 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'man', 'motorcycle', 'water', 'mountain', 'road', 'bike', 'bridge', '[UNK]', 'hill', 'curb', 'bush', 'jacket', 'tire', 'ground', 'head', 'grass', 'tree', 'rock', 'wheel', 'mirror', 'line', 'pole', 'hand', 'background', 'shadow', 'windshield', 'shirt', 'hair', 'helmet', 'light', 'pipe', 'sidewalk', 'face', 'structure', 'field', 'seat', 'ocean', 'dirt', 'building', 'person', 'street', 'distance', 'barrier', 'front', 'tower', 'sunglasses', 'plate', 'box', 'leg'] 2022-03-16 15:45:30,842.842 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'water', 'road', 'light', 'field', 'ground', 'rock', 'hill', 'bridge', 'mountain', 'distance', 'tree', 'happy', 'box', 'sky', 'shirt', 'shadow', 'wheel', 'grass', 'bush', 'pole', 'jacket', 'dirt', 'bike', 'motorcycle', 'helmet', 'tire', 'curb', 'windshield'] 2022-03-16 15:47:54,396.396 2829:trainer.py:487 do_train_dict(): eta: 19:52:01 iter: 23700 speed: 297.1 images/sec total_norm: 138.1699 (141.0114) loss: 149.3079 (149.9088) masked_loss: 1.6291 (1.6486) tag_loss: 147.3706 (148.2602) time: 1.4332 (1.7231) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7179) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:47:54,756.756 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 15:47:54,756.756 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.33474731445312 2022-03-16 15:47:54,756.756 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.21426976628665 2022-03-16 15:48:07,142.142 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018957236781716347 2022-03-16 15:48:07,143.143 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:48:07,143.143 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'art', '##isan', 'pizza', '[MASK]', 'sit', 'cooked', 'and', 'ready', 'to', 'eat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:48:07,158.158 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', 'plate', 'table', 'cheese', 'food', '[UNK]', 'crust', 'bowl', 'slice', 'hand', 'tray', 'person', 'dish', 'onion', 'tomato', 'spoon', 'sauce', 'shrimp', 'knife', 'pepper', 'cloth', 'handle', 'napkin', 'fork', 'topping', 'finger', 'pan', 'different', 'top', 'glass', 'delicious', 'small', 'fry', 'pea', 'cup', 'large', 'bread', 'cutter', 'pie', 'shirt', 'arm', 'close', 'wooden', 'rice', 'white', 'ready', 'background', 'fries', 'french', 'ring'] 2022-03-16 15:48:22,985.985 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'person', 'table', 'arm', 'phone', 'ready', 'sit', 'plate', 'cheese', 'pizza', 'shrimp', 'napkin', 'onion'] 2022-03-16 15:50:46,901.901 2829:trainer.py:487 do_train_dict(): eta: 19:49:23 iter: 23800 speed: 296.8 images/sec total_norm: 136.3897 (138.5133) loss: 148.4475 (151.0699) masked_loss: 1.6305 (1.6223) tag_loss: 147.2752 (149.4476) time: 1.4336 (1.7250) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7199) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:50:47,261.261 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 15:50:47,262.262 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.40493774414062 2022-03-16 15:50:47,262.262 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.21598830282937 2022-03-16 15:50:59,841.841 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018976902589201927 2022-03-16 15:50:59,841.841 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:50:59,842.842 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', '[MASK]', 'a', 'man', '[MASK]', 'through', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:50:59,857.857 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', '[UNK]', 'snow', 'ground', 'ski', 'pole', 'sunglasses', 'man', 'trunk', 'jacket', 'track', 'glove', 'hair', 'hand', 'person', 'head', 'coat', 'branch', 'face', 'glasses', 'boot', 'shadow', 'leg', 'skier', 'foot', 'sky', 'slope', 'shoe', 'stick', 'snowy', 'poles', 'hat', 'bush', 'shirt', 'hill', 'country', 'woman', 'arm', 'cross', 'flag', 'house', 'sign', 'base', 'leaf', 'skiing', 'wood', 'building', 'trail', 'path', 'green'] 2022-03-16 15:51:15,751.751 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'ground', 'hair', 'track', 'tree', 'sky', 'leg', 'snow', 'shadow', 'flag', 'coat', 'pole', 'jacket', 'glasses', 'trunk', 'ski', 'boot', 'shoe', 'knot', 'glove', 'sunglasses'] 2022-03-16 15:53:39,358.358 2829:trainer.py:487 do_train_dict(): eta: 19:46:44 iter: 23900 speed: 296.9 images/sec total_norm: 136.4052 (143.0688) loss: 149.8061 (150.1617) masked_loss: 1.6536 (1.6525) tag_loss: 148.2077 (148.5092) time: 1.4324 (1.7246) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7195) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:53:39,718.718 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 15:53:39,718.718 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.28103637695312 2022-03-16 15:53:39,719.719 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2244913260142 2022-03-16 15:53:52,212.212 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018963953480124474 2022-03-16 15:53:52,212.212 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:53:52,213.213 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', 'horse', 'in', 'a', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:53:52,228.228 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'trunk', 'person', 'tail', 'horse', 'shadow', 'park', 'shirt', 'branch', 'path', '[UNK]', 'ground', 'green', 'man', 'head', 'field', 'building', 'leg', 'woman', 'leaf', 'fence', 'vest', 'road', 'rock', 'house', 'hill', 'pole', 'roof', 'helmet', 'wall', 'hat', 'jacket', 'pathway', 'post', 'saddle', 'track', 'boot', 'front', 'top', 'line', 'rider', 'sign', 'dirt', 'bench', 'couple', 'grassy', 'sky', 'area', 'large'] 2022-03-16 15:54:08,186.186 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['park', 'woman', 'field', 'person', 'tree', 'horse', 'branch', 'shirt', 'path', 'shadow', 'grass', 'tail', 'flower', 'trunk'] 2022-03-16 15:56:31,880.880 2829:trainer.py:487 do_train_dict(): eta: 19:44:05 iter: 24000 speed: 296.8 images/sec total_norm: 134.1169 (136.3411) loss: 147.1749 (148.6082) masked_loss: 1.6555 (1.6598) tag_loss: 145.7497 (146.9485) time: 1.4327 (1.7252) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4276 (1.7201) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:56:32,241.241 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 15:56:32,241.241 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.04132080078125 2022-03-16 15:56:32,241.241 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.21855190482871 2022-03-16 15:56:44,962.962 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018973737955093384 2022-03-16 15:56:44,963.963 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:56:44,963.963 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'that', '[MASK]', '[MASK]', 'a', 'wine', 'glass', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:56:44,979.979 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'glass', 'woman', 'wall', 'hand', 'wine', 'chair', 'nose', 'painting', 'ear', 'eye', 'head', 'face', 'arm', 'picture', 'shirt', '[UNK]', 'box', 'table', 'bottle', 'top', 'watch', 'plate', 'strap', 'frame', 'mouth', 'man', 'tank', 'person', 'dress', 'girl', 'necklace', 'napkin', 'ring', 'wrist', 'finger', 'water', 'smile', 'bowl', 'label', 'knife', 'bracelet', 'purse', 'light', 'window', 'lid', 'restaurant', 'paper', 'curtain', 'door'] 2022-03-16 15:57:00,917.917 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'water', 'top', 'woman', 'hair', 'person', 'table', 'wall', 'arm', 'glass', 'eye', 'chair', 'paper', 'box', 'shirt', 'label', 'picture', 'painting', 'finger', 'dress', 'nose', 'wine', 'ear', 'frame', 'bottle', 'cap', 'flower', 'glasses', 'pitcher', 'pot', 'shelf', 'rack', 'wallet', 'strap'] 2022-03-16 15:59:24,489.489 2829:trainer.py:487 do_train_dict(): eta: 19:41:27 iter: 24100 speed: 296.6 images/sec total_norm: 133.4590 (136.2929) loss: 147.5028 (147.9796) masked_loss: 1.5983 (1.6425) tag_loss: 145.7913 (146.3371) time: 1.4327 (1.7262) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4277 (1.7211) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 15:59:24,849.849 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 15:59:24,849.849 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.29139709472656 2022-03-16 15:59:24,850.850 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2074146270752 2022-03-16 15:59:37,589.589 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01893622614443302 2022-03-16 15:59:37,590.590 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 15:59:37,590.590 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'laying', 'flat', 'on', 'a', 'surf', '##board', 'and', 'riding', 'a', 'wave', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 15:59:37,606.606 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', '[UNK]', 'wave', 'head', 'hair', 'hand', 'board', 'man', 'arm', 'foot', 'nose', 'face', 'suit', 'ear', 'surfer', 'mouth', 'sleeve', 'leg', 'logo', 'wet', 'person', 'boy', 'cord', 'eye', 'ocean', 'surf', 'watch', 'top', 'shirt', 'hat', 'dog', 'stripe', 'jacket', 'rope', 'reflection', 'shoe', 'design', 'black', 'handle', 'woman', 'short', 'strap', 'vest', 'fin', 'leash', 'girl', 'ankle', 'helmet', 'writing', 'body'] 2022-03-16 15:59:53,489.489 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'water', 'board', 'hair', 'mouth', 'eye', 'leg', 'flat', 'wave', 'nose', 'ear', 'bubble', 'surfer'] 2022-03-16 16:02:16,962.962 2829:trainer.py:487 do_train_dict(): eta: 19:38:48 iter: 24200 speed: 296.9 images/sec total_norm: 135.6889 (138.5408) loss: 148.0948 (150.4723) masked_loss: 1.5965 (1.6018) tag_loss: 146.6483 (148.8705) time: 1.4314 (1.7247) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4265 (1.7196) save_time: 8.8805 (25.0095) lr: 0.000064 max mem: 26307 2022-03-16 16:02:17,324.324 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6153846383094788 2022-03-16 16:02:17,325.325 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.43109130859375 2022-03-16 16:02:17,325.325 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.22078709543487 2022-03-16 16:02:29,983.983 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01891171932220459 2022-03-16 16:02:29,983.983 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:02:29,984.984 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'on', '[MASK]', 'different', 'colored', 'fire', 'hydra', '##nts', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:02:29,999.999 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'shoe', 'fire', 'leg', 'ground', 'person', 'man', 'top', 'foot', 'cap', 'bolt', 'road', 'shirt', 'wall', 'sidewalk', 'line', 'hand', 'shadow', 'arm', 'red', 'photo', 'bottom', 'picture', 'knob', 'different', 'base', 'green', 'floor', 'head', 'stripe', 'next', 'jean', 'couple', 'black', 'logo', 'chain', 'lid', 'hair', 'side', 'yellow', 'street', 'image', 'white', 'woman', 'sock', 'jacket', 'number', 'face', 'close', 'wheel'] 2022-03-16 16:02:45,902.902 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'top', 'different', 'fire', 'ground', 'person', 'foot', 'leg', 'cap', 'shoe', 'sidewalk'] 2022-03-16 16:05:09,504.504 2829:trainer.py:487 do_train_dict(): eta: 19:36:09 iter: 24300 speed: 296.7 images/sec total_norm: 136.3977 (138.6405) loss: 153.6107 (152.7312) masked_loss: 1.5811 (1.5897) tag_loss: 152.2676 (151.1415) time: 1.4326 (1.7254) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7202) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:05:09,866.866 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 16:05:09,866.866 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.2525634765625 2022-03-16 16:05:09,866.866 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.23538722366582 2022-03-16 16:05:22,628.628 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018954303115606308 2022-03-16 16:05:22,628.628 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:05:22,629.629 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'small', '[MASK]', 'with', 'bed', 'and', 'other', 'furniture', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:05:22,644.644 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'room', 'wall', 'bed', 'leg', 'pillow', 'blanket', 'chair', 'tile', 'table', 'bag', 'cushion', '[UNK]', 'cord', 'book', 'clothes', 'bedroom', 'fan', 'paper', 'couch', 'sheet', 'magazine', 'desk', 'clothing', 'stand', 'nightstand', 'seat', 'box', 'handle', 'backpack', 'door', 'window', 'stool', 'can', 'top', 'picture', 'bottle', 'small', 'outlet', 'mattress', 'wire', 'wheel', 'towel', 'shirt', 'carpet', 'furniture', 'lamp', 'hat', 'remote', 'jacket'] 2022-03-16 16:05:38,720.720 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'small', 'room', 'book', 'floor', 'bed', 'table', 'wall', 'seat', 'base', 'magazine', 'cover', 'glass', 'chair', 'leg', 'bedroom', 'plate', 'mirror', 'fan', 'bottle', 'ceiling', 'cap', 'sheet', 'furniture', 'blanket', 'pillow', 'cord', 'outlet', 'candle', 'tile', 'rack', 'stool', 'cushion'] 2022-03-16 16:08:02,470.470 2829:trainer.py:487 do_train_dict(): eta: 19:33:31 iter: 24400 speed: 296.0 images/sec total_norm: 137.0949 (139.7642) loss: 149.7769 (151.7343) masked_loss: 1.5293 (1.5569) tag_loss: 148.3535 (150.1774) time: 1.4349 (1.7297) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4297 (1.7241) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:08:02,832.832 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.3636363744735718 2022-03-16 16:08:02,832.832 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.1591033935547 2022-03-16 16:08:02,833.833 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24375985593213 2022-03-16 16:08:15,657.657 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018912089988589287 2022-03-16 16:08:15,658.658 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:08:15,658.658 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'security', 'officer', 'has', '[MASK]', 'dog', 'searching', 'luggage', '[MASK]', 'the', 'airport', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:08:15,673.673 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['suitcase', 'floor', 'shirt', 'man', '[UNK]', 'hair', 'dog', 'ceiling', 'airport', 'luggage', 'hand', 'sign', 'shoe', 'belt', 'light', 'head', 'pillar', 'column', 'arm', 'handle', 'wall', 'person', 'tag', 'bag', 'tile', 'suit', 'building', 'ear', 'leg', 'line', 'ground', 'cart', 'pole', 'sleeve', 'wheel', 'briefcase', 'collar', 'glass', 'case', 'woman', 'terminal', 'door', 'backpack', 'chair', 'boot', 'vest', 'glasses', 'buckle', 'room', 'leash'] 2022-03-16 16:08:31,550.550 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'light', 'woman', 'hair', 'person', 'floor', 'officer', 'security', 'airport', 'metal', 'sign', 'shirt', 'dog', 'handle', 'wheel', 'cabinet', 'belt', 'ceiling', 'column', 'panel', 'shoe', 'cart', 'pillar', 'suitcase', 'luggage', 'briefcase', 'buckle'] 2022-03-16 16:10:55,067.067 2829:trainer.py:487 do_train_dict(): eta: 19:30:52 iter: 24500 speed: 296.6 images/sec total_norm: 138.0531 (142.2440) loss: 148.4807 (148.6174) masked_loss: 1.5692 (1.5915) tag_loss: 147.0728 (147.0259) time: 1.4327 (1.7259) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7208) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:10:55,429.429 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 16:10:55,430.430 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.04324340820312 2022-03-16 16:10:55,430.430 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24308158130181 2022-03-16 16:11:08,381.381 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018916206434369087 2022-03-16 16:11:08,381.381 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:11:08,381.381 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'motor', '##cy', '##cl', '##ist', 'takes', '[MASK]', 'turn', 'in', '[MASK]', 'of', 'a', 'crowd', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:11:08,397.397 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'man', 'person', 'bike', 'road', 'tire', 'tree', 'building', 'jacket', 'helmet', 'hat', '[UNK]', 'curb', 'street', 'wheel', 'sidewalk', 'shirt', 'number', 'sign', 'arm', 'head', 'photo', 'short', 'pole', 'window', 'roof', 'shadow', 'trunk', 'car', 'ground', 'crowd', 'woman', 'wall', 'sunglasses', 'house', 'door', 'rock', 'hand', 'bicycle', 'black', 'cap', 'boot', 'white', 'statue', 'light', 'boy', 'group', 'shoe', 'glove', 'fence'] 2022-03-16 16:11:24,320.320 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'building', 'road', 'front', 'street', 'woman', 'short', 'car', 'person', 'turn', 'arm', 'window', 'tree', 'sign', 'shirt', 'crowd', 'roof', 'hat', 'pole', 'jacket', 'bike', 'motorcycle', 'banner', 'helmet', 'sidewalk', 'tire', 'curb'] 03-16 16:12:45.674 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 16:12:45.674 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 16:12:46.866 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 16:13:47,949.949 2829:trainer.py:487 do_train_dict(): eta: 19:28:13 iter: 24600 speed: 296.2 images/sec total_norm: 139.0278 (141.9289) loss: 152.9365 (153.4747) masked_loss: 1.6869 (1.6635) tag_loss: 151.3011 (151.8112) time: 1.4325 (1.7288) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7237) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:13:48,311.311 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 16:13:48,312.312 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.33758544921875 2022-03-16 16:13:48,312.312 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24130127014901 2022-03-16 16:14:01,288.288 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018921079114079475 2022-03-16 16:14:01,289.289 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:14:01,289.289 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'driving', '##fully', 'horse', 'pulled', 'wagon', 'with', 'a', 'cow', 'in', 'back', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:14:01,304.304 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'head', 'cart', 'grass', 'cow', 'wheel', 'tire', 'mountain', 'wagon', 'road', 'hat', 'rock', 'back', 'leg', 'ground', 'tail', 'shirt', 'number', 'person', '[UNK]', 'helmet', 'bull', 'animal', 'jacket', 'shadow', 'hill', 'horse', 'plate', 'harness', 'ear', 'license', 'wall', 'water', 'carriage', 'gravel', 'rope', 'truck', 'sign', 'cattle', 'horn', 'snow', 'trailer', 'drawn', 'wood', 'pole', 'bench', 'bush', 'old', 'dirt', 'tree'] 2022-03-16 16:14:17,198.198 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'man', 'number', 'water', 'road', 'ground', 'rock', 'person', 'wall', 'mountain', 'wood', 'horse', 'leg', 'bell', 'plate', 'shadow', 'wheel', 'grass', 'tail', 'hat', 'license', 'jacket', 'wagon', 'helmet', 'cart', 'cow', 'tire', 'harness', 'paw', 'puddle'] 2022-03-16 16:16:40,946.946 2829:trainer.py:487 do_train_dict(): eta: 19:25:35 iter: 24700 speed: 296.0 images/sec total_norm: 136.1129 (140.0984) loss: 148.6623 (146.9752) masked_loss: 1.4780 (1.5283) tag_loss: 146.6696 (145.4469) time: 1.4341 (1.7300) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.7248) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:16:41,306.306 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-16 16:16:41,307.307 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.0332489013672 2022-03-16 16:16:41,307.307 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.24007534211681 2022-03-16 16:16:54,178.178 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019005320966243744 2022-03-16 16:16:54,178.178 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:16:54,179.179 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'orange', 'piece', 'yellowstone', 'luggage', 'sitting', 'next', 'to', '[MASK]', 'light', 'pole', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:16:54,194.194 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'sky', 'window', 'cloud', 'lamp', 'light', 'tree', 'sign', 'street', 'pole', 'sidewalk', 'roof', '[UNK]', 'post', 'suitcase', 'person', 'bag', 'shadow', 'city', 'road', 'wall', 'ground', 'fence', 'man', 'luggage', 'shirt', 'handle', 'plant', 'car', 'door', 'woman', 'hair', 'fire', 'box', 'sunglasses', 'orange', 'hand', 'chimney', 'base', 'jacket', 'flower', 'can', 'next', 'grass', 'jean', 'arm', 'yellow', 'head', 'brick', 'backpack'] 2022-03-16 16:17:10,123.123 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['next', 'building', 'street', 'light', 'ground', 'post', 'person', 'wall', 'plant', 'window', 'tree', 'piece', 'sign', 'sky', 'roof', 'orange', 'cloud', 'pole', 'lamp', 'sidewalk', 'luggage'] 2022-03-16 16:19:34,007.007 2829:trainer.py:487 do_train_dict(): eta: 19:22:56 iter: 24800 speed: 295.9 images/sec total_norm: 135.8421 (138.9159) loss: 147.8478 (150.6669) masked_loss: 1.6022 (1.6336) tag_loss: 146.6045 (149.0333) time: 1.4335 (1.7306) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7254) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:19:34,368.368 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 16:19:34,369.369 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.7824249267578 2022-03-16 16:19:34,369.369 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2547194239605 2022-03-16 16:19:47,437.437 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01897704228758812 2022-03-16 16:19:47,437.437 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:19:47,437.437 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'officers', 'are', 'standing', 'and', 'on', '##₇', 'to', 'form', 'a', 'line', 'across', 'a', 'town', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:19:47,453.453 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'horse', 'person', 'man', 'building', 'pole', 'helmet', 'jacket', 'street', '[UNK]', 'sidewalk', 'bus', 'city', 'hat', 'child', 'sky', 'light', 'sign', 'line', 'ground', 'uniform', 'car', 'road', 'girl', 'boot', 'leg', 'bag', 'jean', 'boy', 'brick', 'policeman', 'group', 'shoe', 'police', 'woman', 'head', 'window', 'flag', 'vest', 'parade', 'coat', 'traffic', 'backpack', 'wall', 'post', 'officer', 'shirt', 'face', 'cover', 'tire'] 2022-03-16 16:20:03,499.499 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'head', 'man', 'town', 'line', 'building', 'street', 'center', 'person', 'officer', 'van', 'window', 'tree', 'horse', 'sign', 'shirt', 'bus', 'flag', 'brick', 'hat', 'pole', 'jacket', 'parade', 'helmet', 'backpack'] 2022-03-16 16:22:27,091.091 2829:trainer.py:487 do_train_dict(): eta: 19:20:18 iter: 24900 speed: 295.8 images/sec total_norm: 138.9472 (141.9700) loss: 148.8446 (149.5296) masked_loss: 1.6578 (1.6754) tag_loss: 147.0405 (147.8542) time: 1.4338 (1.7309) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.7258) save_time: 8.8805 (25.0095) lr: 0.000063 max mem: 26307 2022-03-16 16:22:27,453.453 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 16:22:27,453.453 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.02325439453125 2022-03-16 16:22:27,454.454 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.26436616516114 2022-03-16 16:22:40,550.550 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01898140087723732 2022-03-16 16:22:40,550.550 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:22:40,551.551 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'fig', '##uri', '##nes', 'of', 'humans', 'and', 'animals', 'dressed', 'as', 'humans', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:22:40,566.566 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sign', 'doll', 'bear', 'hat', 'letter', 'chair', 'head', 'fence', 'cat', 'word', '[UNK]', 'animal', 'reflection', 'table', 'curtain', 'glass', 'display', 'shirt', 'dog', 'ear', 'hair', 'cage', 'man', 'pole', 'skull', 'umbrella', 'stuffed', 'toy', 'wood', 'wall', 'teddy', 'statue', 'paper', 'paw', 'person', 'cloth', 'frame', 'arm', 'clothes', 'logo', 'dress', 'leg', 'face', 'top', 'nose', 'fur', 'woman', 'flag', 'jacket'] 2022-03-16 16:22:56,405.405 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hair', 'word', 'table', 'glass', 'window', 'letter', 'sign', 'dog', 'animal', 'dress', 'bear', 'cat', 'hat', 'cap', 'skull', 'cage', 'fence', 'toy', 'reflection', 'doll', 'kitten'] 2022-03-16 16:25:20,265.265 2829:trainer.py:487 do_train_dict(): eta: 19:17:39 iter: 25000 speed: 295.7 images/sec total_norm: 135.7363 (140.2340) loss: 148.0405 (148.2562) masked_loss: 1.6266 (1.6477) tag_loss: 146.2447 (146.6085) time: 1.4332 (1.7317) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7266) save_time: 8.8805 (25.0095) lr: 0.000062 max mem: 26307 2022-03-16 16:25:20,267.267 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0025000.pt 2022-03-16 16:25:29,357.357 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 16:25:29,358.358 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.1492156982422 2022-03-16 16:25:29,358.358 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.27328960829047 2022-03-16 16:25:42,439.439 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01900436170399189 2022-03-16 16:25:42,440.440 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:25:42,440.440 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'yellow', 'trolley', 'passing', 'by', 'street', 'intersection', '[MASK]', 'night', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:25:42,455.455 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pole', 'street', 'light', 'window', 'sidewalk', 'road', 'sign', 'car', 'bus', 'traffic', 'curb', 'line', 'door', 'building', '[UNK]', 'poster', 'wall', 'tree', 'night', 'graffiti', 'train', 'pillar', 'stop', 'wheel', 'man', 'person', 'picture', 'city', 'arrow', 'tire', 'fire', 'trolley', 'column', 'base', 'wire', 'ceiling', 'van', 'sky', 'front', 'intersection', 'bridge', 'windshield', 'ground', 'post', 'booth', 'back', 'corner', 'letter', 'box', 'shirt'] 2022-03-16 16:25:58,226.226 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['can', 'line', 'night', 'building', 'door', 'road', 'street', 'light', 'car', 'ground', 'person', 'wall', 'van', 'window', 'train', 'sign', 'yellow', 'bus', 'traffic', 'passing', 'ceiling', 'pole', 'intersection', 'sidewalk', 'trolley'] 2022-03-16 16:28:21,141.141 2829:trainer.py:487 do_train_dict(): eta: 19:15:13 iter: 25100 speed: 283.1 images/sec total_norm: 139.7350 (140.6100) loss: 149.4645 (151.5770) masked_loss: 1.5557 (1.5728) tag_loss: 148.0794 (150.0042) time: 1.4327 (1.8087) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7163) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:28:21,503.503 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 16:28:21,503.503 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.398681640625 2022-03-16 16:28:21,503.503 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2692554564703 2022-03-16 16:28:34,490.490 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018990544602274895 2022-03-16 16:28:34,490.490 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:28:34,490.490 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'partially', '[MASK]', 'in', 'a', 'body', 'of', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:28:34,506.506 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'water', 'trunk', 'grass', 'ear', '[UNK]', 'sky', 'head', 'leg', 'eye', 'body', 'ripple', 'name', 'skin', 'tail', 'mouth', 'tree', 'shore', 'foot', 'face', 'back', 'river', 'writing', 'logo', 'land', 'reflection', 'background', 'branch', 'next', 'large', 'other', 'splash', 'shadow', 'bank', 'field', 'wave', 'couple', 'bird', 'baby', 'hair', 'horn', 'ground', 'hole', 'plant', 'rock', 'standing', 'small', 'line', 'bush', 'number'] 2022-03-16 16:28:50,422.422 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'name', 'water', 'body', 'skin', 'eye', 'sky', 'leg', 'wave', 'ear', 'grass', 'logo', 'trunk', 'elephant', 'ripple'] 2022-03-16 16:31:15,575.575 2829:trainer.py:487 do_train_dict(): eta: 19:12:36 iter: 25200 speed: 293.5 images/sec total_norm: 137.7838 (139.7146) loss: 146.6917 (148.7538) masked_loss: 1.5849 (1.6268) tag_loss: 145.3175 (147.1270) time: 1.4328 (1.7443) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7392) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:31:15,936.936 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 16:31:15,936.936 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.5997772216797 2022-03-16 16:31:15,936.936 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.2923086565945 2022-03-16 16:31:29,139.139 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01899297535419464 2022-03-16 16:31:29,140.140 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:31:29,140.140 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'riding', '[MASK]', 'down', '[MASK]', 'wooded', 'trail', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:31:29,156.156 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'helmet', 'boot', 'horse', 'jacket', '[UNK]', 'tree', 'face', 'foot', 'man', 'leg', 'vest', 'ground', 'person', 'trail', 'branch', 'path', 'bush', 'nose', 'grass', 'ear', 'woman', 'forest', 'hat', 'glove', 'dirt', 'saddle', 'eye', 'road', 'shirt', 'mane', 'patch', 'stripe', 'girl', 'hair', 'brown', 'scarf', 'coat', 'glasses', 'hand', 'sky', 'jean', 'wood', 'boy', 'horseback', 'chain', 'rider', 'tail', 'rock', 'plant'] 2022-03-16 16:31:45,076.076 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'woman', 'ground', 'person', 'forest', 'eye', 'foot', 'tree', 'horse', 'path', 'leg', 'trail', 'nose', 'grass', 'bush', 'hat', 'jacket', 'glasses', 'boot', 'helmet', 'saddle', 'wooded', 'vest', 'mane'] 2022-03-16 16:34:08,874.874 2829:trainer.py:487 do_train_dict(): eta: 19:09:57 iter: 25300 speed: 295.4 images/sec total_norm: 135.1420 (140.3543) loss: 150.6504 (150.7531) masked_loss: 1.6197 (1.6218) tag_loss: 148.7565 (149.1314) time: 1.4338 (1.7330) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.7278) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:34:09,235.235 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 16:34:09,236.236 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.993896484375 2022-03-16 16:34:09,236.236 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.29301742493637 2022-03-16 16:34:22,331.331 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.018994001671671867 2022-03-16 16:34:22,331.331 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:34:22,332.332 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'cow', 'standing', 'on', 'a', 'sandy', '[MASK]', '##ener', 'boats', 'in', 'the', 'background', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:34:22,347.347 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'sky', 'eye', 'boat', 'sand', 'nose', 'head', 'tree', 'leg', 'beach', 'cow', 'person', 'water', 'shadow', 'mountain', 'rock', 'rope', 'hill', 'face', 'ocean', '[UNK]', 'ground', 'tail', 'building', 'body', 'horn', 'man', 'mouth', 'cloud', 'animal', 'hair', 'sandy', 'background', 'neck', 'wave', 'house', 'shore', 'bird', 'collar', 'child', 'distance', 'couple', 'woman', 'string', 'palm', 'shirt', 'roof', 'footprint', 'umbrella', 'short'] 2022-03-16 16:34:38,282.282 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'water', 'white', 'rock', 'person', 'child', 'wall', 'mountain', 'eye', 'tree', 'beach', 'sky', 'boat', 'ocean', 'leg', 'background', 'wave', 'nose', 'ear', 'shadow', 'flag', 'palm', 'sand', 'tail', 'cloud', 'sandy', 'rope', 'cow'] 2022-03-16 16:37:02,282.282 2829:trainer.py:487 do_train_dict(): eta: 19:07:18 iter: 25400 speed: 295.3 images/sec total_norm: 134.5178 (136.1719) loss: 145.9712 (147.0637) masked_loss: 1.5456 (1.6194) tag_loss: 144.7548 (145.4443) time: 1.4349 (1.7341) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.7289) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:37:02,642.642 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 16:37:02,643.643 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.9478302001953 2022-03-16 16:37:02,643.643 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.30016636567957 2022-03-16 16:37:15,837.837 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01898724026978016 2022-03-16 16:37:15,837.837 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:37:15,838.838 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'barefoot', 'man', 'sitting', 'on', 'one', 'of', '[MASK]', 'red', 'chairs', 'reading', 'his', 'phone', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:37:15,853.853 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'ground', 'jean', 'hair', 'glasses', 'hand', 'chair', 'shoe', 'phone', 'leg', 'head', '[UNK]', 'shadow', 'table', 'person', 'ear', 'sidewalk', 'face', 'stool', 'flower', 'cell', 'wall', 'bag', 'foot', 'sock', 'watch', 'stand', 'bench', 'arm', 'sunglasses', 'woman', 'jacket', 'logo', 'short', 'pole', 'backpack', 'top', 'sign', 'stripe', 'dirt', 'number', 'cup', 'next', 'writing', 'window', 'nose', 'letter', 'bottle', 'tie'] 2022-03-16 16:37:31,797.797 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'several', 'red', 'ground', 'hair', 'person', 'table', 'phone', 'chair', 'foot', 'jean', 'shirt', 'leg', 'bag', 'ear', 'shadow', 'glasses', 'shoe', 'stool', 'sunglasses'] 2022-03-16 16:39:55,773.773 2829:trainer.py:487 do_train_dict(): eta: 19:04:40 iter: 25500 speed: 295.1 images/sec total_norm: 137.1805 (140.4316) loss: 148.2104 (147.2036) masked_loss: 1.5100 (1.5815) tag_loss: 145.9834 (145.6221) time: 1.4338 (1.7349) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.7294) save_time: 8.8805 (21.7526) lr: 0.000062 max mem: 26307 2022-03-16 16:39:56,136.136 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4000000059604645 2022-03-16 16:39:56,136.136 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.21771240234375 2022-03-16 16:39:56,137.137 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.3150619417429 2022-03-16 16:40:09,359.359 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01905658282339573 2022-03-16 16:40:09,359.359 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:40:09,359.359 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'doll', 'with', '[MASK]', 'large', 'head', 'next', 'to', 'a', 'banana', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:40:09,375.375 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['paper', 'desk', 'book', 'table', 'pen', '[UNK]', 'mouse', 'light', 'keyboard', 'computer', 'wall', 'cord', 'button', 'cup', 'ceiling', 'monitor', 'floor', 'wire', 'screen', 'bag', 'box', 'office', 'phone', 'handle', 'key', 'top', 'chair', 'room', 'pile', 'pad', 'laptop', 'container', 'shelf', 'reflection', 'pencil', 'label', 'umbrella', 'picture', 'window', 'man', 'magazine', 'folder', 'logo', 'cell', 'black', 'cap', 'open', 'next', 'notebook', 'white'] 2022-03-16 16:40:25,230.230 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'large', 'hair', 'mouth', 'floor', 'table', 'wall', 'arm', 'eye', 'paper', 'clothes', 'dress', 'flower', 'bow', 'wire', 'doll', 'shoe', 'cord', 'banana', 'sock'] 03-16 16:42:46.936 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 16:42:46.937 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 16:42:48.291 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 16:42:49,334.334 2829:trainer.py:487 do_train_dict(): eta: 19:02:01 iter: 25600 speed: 295.0 images/sec total_norm: 139.8911 (142.9466) loss: 146.4619 (148.1393) masked_loss: 1.5929 (1.6069) tag_loss: 144.8118 (146.5324) time: 1.4344 (1.7356) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4292 (1.7304) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:42:49,696.696 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 16:42:49,696.696 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.64019775390625 2022-03-16 16:42:49,696.696 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.32209862438157 2022-03-16 16:43:02,977.977 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019058486446738243 2022-03-16 16:43:02,977.977 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:43:02,978.978 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'guy', 'is', 'taking', 'a', 'picture', 'of', 'himself', 'holding', 'a', 'phone', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:43:02,993.993 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hat', 'head', 'hand', 'logo', 'building', 'grass', 'person', '[UNK]', 'phone', 'screen', 'jacket', 'camera', 'window', 'mouth', 'face', 'wall', 'picture', 'cap', 'television', 'glasses', 'reflection', 'shirt', 'speaker', 'cell', 'light', 'nose', 'table', 'pole', 'coat', 'handle', 'front', 'button', 'sunglasses', 'sign', 'top', 'ceiling', 'tie', 'woman', 'arm', 'sky', 'black', 'chair', 'bag', 'suit', 'tree', 'ground', 'scarf', 'book', 'finger'] 2022-03-16 16:43:18,930.930 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'building', 'light', 'cup', 'mouth', 'wall', 'phone', 'guy', 'chair', 'window', 'sign', 'picture', 'nose', 'camera', 'coat', 'grass', 'hat', 'cap', 'jacket', 'glasses', 'logo', 'sunglasses'] 2022-03-16 16:45:42,649.649 2829:trainer.py:487 do_train_dict(): eta: 18:59:22 iter: 25700 speed: 295.4 images/sec total_norm: 137.5378 (141.7697) loss: 144.8826 (148.2045) masked_loss: 1.6210 (1.6339) tag_loss: 143.1542 (146.5706) time: 1.4324 (1.7332) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7280) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:45:43,010.010 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 16:45:43,010.010 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.68502807617188 2022-03-16 16:45:43,010.010 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.3220149639041 2022-03-16 16:45:57,129.129 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019044969230890274 2022-03-16 16:45:57,129.129 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:45:57,129.129 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'taking', 'a', 'picture', 'of', 'a', '[MASK]', 'vanity', '[MASK]', 'sink', 'in', 'a', '[MASK]', 'mirror', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:45:57,144.144 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'mirror', 'towel', 'sink', 'bathroom', 'shirt', 'floor', 'bottle', 'man', 'person', 'door', 'arm', 'hand', 'tray', 'toilet', 'head', 'light', 'rack', 'tile', 'reflection', 'handle', 'hair', 'soap', 'short', 'tissue', 'shelf', 'drain', 'dish', 'rug', 'tank', 'box', 'woman', 'paper', 'counter', 'tub', 'top', 'outlet', 'glass', 'board', 'leg', 'cup', 'large', 'picture', 'vanity', 'sign', 'plate', 'napkin', 'pipe', 'shower'] 2022-03-16 16:46:13,167.167 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'large', 'door', 'woman', 'person', 'floor', 'wall', 'arm', 'watch', 'box', 'shirt', 'picture', 'camera', 'mirror', 'bathroom', 'bottle', 'sink', 'purse', 'towel', 'toilet', 'outlet', 'tile', 'vanity', 'vent'] 2022-03-16 16:48:36,777.777 2829:trainer.py:487 do_train_dict(): eta: 18:56:43 iter: 25800 speed: 294.0 images/sec total_norm: 135.0892 (137.9039) loss: 147.6626 (148.3559) masked_loss: 1.5781 (1.5925) tag_loss: 145.9246 (146.7634) time: 1.4326 (1.7413) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4275 (1.7360) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:48:37,139.139 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 16:48:37,139.139 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.58230590820312 2022-03-16 16:48:37,139.139 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.33457571199041 2022-03-16 16:48:50,509.509 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019095269963145256 2022-03-16 16:48:50,509.509 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:48:50,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', 'on', 'a', 'court', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:48:50,525.525 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', '[UNK]', 'line', 'short', 'sock', 'hand', 'court', 'man', 'shoe', 'shirt', 'tennis', 'arm', 'hair', 'ball', 'band', 'head', 'player', 'handle', 'ground', 'shadow', 'logo', 'foot', 'wrist', 'face', 'male', 'knee', 'stripe', 'sleeve', 'blue', 'person', 'glove', 'ankle', 'orange', 'green', 'yellow', 'grass', 'watch', 'ready', 'ear', 'match', 'young', 'serve', 'action', 'collar', 'swing', 'bat', 'calf', 'black', 'hat', 'string'] 2022-03-16 16:49:06,351.351 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'band', 'court', 'short', 'hair', 'arm', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'shadow', 'shoe', 'sock'] 2022-03-16 16:51:30,243.243 2829:trainer.py:487 do_train_dict(): eta: 18:54:04 iter: 25900 speed: 295.2 images/sec total_norm: 137.7176 (139.7167) loss: 147.9422 (148.7543) masked_loss: 1.6580 (1.6801) tag_loss: 146.2596 (147.0742) time: 1.4331 (1.7346) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4277 (1.7294) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:51:30,605.605 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 16:51:30,606.606 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.23956298828125 2022-03-16 16:51:30,606.606 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.36111357762263 2022-03-16 16:51:44,148.148 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019090967252850533 2022-03-16 16:51:44,148.148 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:51:44,149.149 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'airplanes', 'flying', 'in', 'a', '[MASK]', 'in', 'the', 'sky', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:51:44,164.164 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['airplane', 'sky', 'smoke', 'wing', 'trail', 'jet', 'tail', 'cloud', 'formation', 'blue', '[UNK]', 'line', 'group', 'nose', 'stream', 'air', 'aircraft', 'fighter', 'body', 'tree', 'plane', 'high', 'overhead', 'small', 'stripe', 'cockpit', 'front', 'squadron', 'engine', 'red', 'view', 'white', 'back', 'fuselage', 'day', 'other', 'vapor', 'top', 'light', 'clear', 'large', 'tank', 'writing', 'couple', 'window', 'letter', 'logo', 'fin', 'clouds', 'different'] 2022-03-16 16:52:00,083.083 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'wing', 'sky', 'formation', 'trail', 'smoke', 'tail', 'jet', 'airplane'] 2022-03-16 16:54:23,962.962 2829:trainer.py:487 do_train_dict(): eta: 18:51:25 iter: 26000 speed: 294.7 images/sec total_norm: 138.6972 (142.5949) loss: 148.2448 (149.9707) masked_loss: 1.6097 (1.6407) tag_loss: 146.6478 (148.3300) time: 1.4338 (1.7372) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7321) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:54:24,323.323 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 16:54:24,324.324 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.13093566894531 2022-03-16 16:54:24,324.324 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.37203218014304 2022-03-16 16:54:37,998.998 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019138723611831665 2022-03-16 16:54:37,998.998 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 16:54:37,998.998 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'with', 'brown', '[MASK]', 'is', 'grazing', 'in', 'a', '[MASK]', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 16:54:38,014.014 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'field', 'tail', 'grass', 'zebra', 'head', 'sky', 'ear', 'bush', 'stripe', 'background', 'tree', 'mane', 'hill', 'mountain', '[UNK]', 'eye', 'leaf', 'neck', 'flower', 'nose', 'mouth', 'hair', 'animal', 'ground', 'green', 'face', 'open', 'dirt', 'grassy', 'horn', 'rock', 'back', 'grazing', 'next', 'cow', 'trunk', 'lush', 'other', 'standing', 'deer', 'area', 'distance', 'body', 'cloud', 'top', 'foot', 'large', 'wild', 'young'] 2022-03-16 16:54:53,959.959 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'field', 'ground', 'hair', 'green', 'mouth', 'brown', 'hill', 'mountain', 'eye', 'neck', 'tree', 'sky', 'leg', 'background', 'nose', 'ear', 'grass', 'tail', 'bush', 'flower', 'leaf', 'stripe', 'mane', 'zebra'] 2022-03-16 16:57:17,605.605 2829:trainer.py:487 do_train_dict(): eta: 18:48:46 iter: 26100 speed: 294.9 images/sec total_norm: 140.6286 (143.1858) loss: 149.3145 (148.3190) masked_loss: 1.6170 (1.6073) tag_loss: 147.3820 (146.7116) time: 1.4340 (1.7365) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4289 (1.7314) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307 2022-03-16 16:57:17,968.968 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 16:57:17,968.968 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.69021606445312 2022-03-16 16:57:17,968.968 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
2022-03-16 16:57:31,614.614 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01917995512485504
2022-03-16 16:57:31,614.614 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 16:57:31,614.614 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'on', 'a', 'red', 'motorcycle', 'during', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 16:57:31,630.630 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'man', 'tire', 'bike', 'helmet', 'tree', 'road', 'bush', 'wall', 'glove', 'wheel', 'person', '[UNK]', 'grass', 'boot', 'leg', 'jacket', 'ground', 'number', 'fence', 'arm', 'dirt', 'curb', 'suit', 'stripe', 'pole', 'post', 'line', 'hand', 'windshield', 'shoe', 'shirt', 'logo', 'foot', 'leaf', 'rider', 'head', 'background', 'sign', 'hedge', 'sidewalk', 'pipe', 'plate', 'light', 'fender', 'street', 'back', 'track', 'red', 'mirror']
2022-03-16 16:57:47,568.568 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'number', 'line', 'road', 'red', 'rock', 'race', 'person', 'tree', 'background', 'chain', 'truck', 'suit', 'flag', 'wheel', 'bush', 'bottle', 'bike', 'motorcycle', 'boot', 'sleeve', 'helmet', 'cart', 'tire', 'curb', 'glove']
2022-03-16 17:00:11,350.350 2829:trainer.py:487 do_train_dict(): eta: 18:46:07 iter: 26200 speed: 294.7 images/sec total_norm: 138.7981 (140.1107) loss: 149.5854 (149.4349) masked_loss: 1.5622 (1.5844) tag_loss: 147.8237 (147.8505) time: 1.4347 (1.7374) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4296 (1.7323) save_time: 8.8805 (21.7526) lr: 0.000061 max mem: 26307
2022-03-16 17:00:11,711.711 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-16 17:00:11,711.711 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.23646545410156
2022-03-16 17:00:11,711.711 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.37585929319432
2022-03-16 17:00:25,454.454 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019185159355401993
2022-03-16 17:00:25,454.454 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:00:25,455.455 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'children', 'standing', 'around', 'a', 'candle', 'filled', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:00:25,470.470 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'shirt', 'candle', 'hair', 'child', 'birthday', 'boy', 'girl', 'table', 'head', 'hand', 'person', 'tray', '[UNK]', 'eye', 'cup', 'plate', 'woman', 'face', 'flame', 'crown', 'hat', 'nose', 'little', 'picture', 'box', 'writing', 'kid', 'chair', 'wall', 'young', 'front', 'sweater', 'arm', 'floor', 'paper', 'man', 'necklace', 'bench', 'container', 'cardboard', 'dinosaur', 'letter', 'word', 'handle', 'ball', 'window', 'small', 'cream', 'group']
2022-03-16 17:00:41,423.423 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'hand', 'woman', 'cup', 'hair', 'girl', 'person', 'child', 'table', 'boy', 'writing', 'eye', 'shirt', 'crown', 'nose', 'plate', 'bench', 'cake', 'tray', 'candle', 'sweater', 'dinosaur']
2022-03-16 17:03:05,020.020 2829:trainer.py:487 do_train_dict(): eta: 18:43:27 iter: 26300 speed: 294.8 images/sec total_norm: 137.6970 (139.7932) loss: 152.1063 (149.5219) masked_loss: 1.5213 (1.5976) tag_loss: 150.5312 (147.9243) time: 1.4333 (1.7366) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4281 (1.7314) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:03:05,381.381 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-16 17:03:05,382.382 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.57514953613281
2022-03-16 17:03:05,382.382 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.38732507012107
2022-03-16 17:03:19,136.136 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019203023985028267
2022-03-16 17:03:19,137.137 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:03:19,137.137 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'with', 'an', 'orange', 'shirt', 'hu', '##rl', '##s', '[MASK]', 'fr', '[MASK]', '##bee', 'in', '[MASK]', 'of', 'him', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:03:19,152.152 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'short', 'tree', 'ground', 'shoe', 'man', 'shadow', 'arm', 'bush', '[UNK]', 'grass', 'head', 'leg', 'hand', 'sock', 'hair', 'trunk', 'plant', 'hat', 'boy', 'person', 'sky', 'wood', 'dirt', 'ear', 'game', 'young', 'path', 'cap', 'watch', 'pole', 'face', 'field', 'glasses', 'foot', 'park', 'sunglasses', 'disc', 'logo', 'orange', 'wrist', 'playing', 'stump', 'leaf', 'area', 'woman', 'trail', 'air', 'cloud', 'stripe']
2022-03-16 17:03:35,100.100 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'front', 'short', 'ground', 'hair', 'arm', 'tree', 'wood', 'shirt', 'orange', 'shadow', 'grass', 'bush', 'dirt', 'glasses', 'shoe', 'stump', 'sock']
2022-03-16 17:05:58,746.746 2829:trainer.py:487 do_train_dict(): eta: 18:40:48 iter: 26400 speed: 294.7 images/sec total_norm: 140.1110 (145.5084) loss: 147.5887 (148.3997) masked_loss: 1.6384 (1.6942) tag_loss: 145.1637 (146.7055) time: 1.4338 (1.7373) data: 0.0001 (0.0001) to_device: 0.0052 (0.0050) time_gpu: 1.4285 (1.7321) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:05:59,109.109 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 17:05:59,109.109 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.11598205566406
2022-03-16 17:05:59,109.109 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.39912128088609
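The `hu`, `##rl`, `##s` pieces in the caption above are WordPiece subword tokens: a leading `##` marks a piece that attaches to the previous token, so `hu ##rl ##s` spells `hurls`, and the masked `fr [MASK] ##bee` span is almost certainly `frisbee`. A minimal detokenizer under standard BERT WordPiece conventions (a sketch, not the pipeline's own code):

```python
def merge_wordpieces(tokens):
    """Join BERT WordPiece tokens back into words, dropping special tokens."""
    words = []
    for tok in tokens:
        if tok in ('[CLS]', '[SEP]', '[PAD]'):
            continue
        if tok.startswith('##') and words:
            words[-1] += tok[2:]  # continuation piece: glue onto the previous word
        else:
            words.append(tok)
    return ' '.join(words)

sample = ['[CLS]', 'a', 'man', 'with', 'an', 'orange', 'shirt', 'hu', '##rl', '##s',
          '[MASK]', 'fr', '[MASK]', '##bee', 'in', '[MASK]', 'of', 'him', '.', '[SEP]']
print(merge_wordpieces(sample))
# -> "a man with an orange shirt hurls [MASK] fr [MASK]bee in [MASK] of him ."
```

The stray tokens seen in other samples, such as `##⇌` and `##木`, are consistent with BERT-style masked-language-model corruption, where a fraction of selected positions is replaced with random vocabulary tokens rather than `[MASK]`.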
2022-03-16 17:06:12,785.785 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019192570820450783
2022-03-16 17:06:12,785.785 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:06:12,786.786 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lady', 'with', 'a', 'red', '[MASK]', ',', '##⇌', 'pants', ',', 'and', 'a', 'clear', 'umbrella', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:06:12,801.801 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', '[UNK]', 'sidewalk', 'window', 'person', 'bus', 'jacket', 'pole', 'road', 'ground', 'door', 'street', 'shoe', 'wall', 'line', 'tree', 'tire', 'woman', 'wheel', 'curb', 'sky', 'bush', 'sign', 'coat', 'building', 'hand', 'reflection', 'car', 'leg', 'fence', 'light', 'bag', 'stripe', 'fire', 'rain', 'purse', 'jean', 'man', 'handle', 'plant', 'boy', 'red', 'child', 'rainy', 'background', 'wet', 'water', 'hair', 'foot', 'grass']
2022-03-16 17:06:28,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'black', 'building', 'door', 'road', 'street', 'red', 'light', 'woman', 'ground', 'person', 'wall', 'clear', 'lady', 'plant', 'window', 'tree', 'sign', 'sky', 'bus', 'leg', 'bag', 'coat', 'bush', 'pole', 'jacket', 'fence', 'reflection', 'shoe', 'sidewalk', 'tire', 'umbrella', 'curb', 'strap']
2022-03-16 17:08:52,610.610 2829:trainer.py:487 do_train_dict(): eta: 18:38:09 iter: 26500 speed: 294.5 images/sec total_norm: 142.1496 (144.8692) loss: 145.1857 (146.8584) masked_loss: 1.5447 (1.5827) tag_loss: 143.3026 (145.2757) time: 1.4334 (1.7386) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4283 (1.7334) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:08:52,971.971 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6129032373428345
2022-03-16 17:08:52,971.971 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.62860107421875
2022-03-16 17:08:52,971.971 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.39220493001149
2022-03-16 17:09:06,768.768 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01921571046113968
2022-03-16 17:09:06,769.769 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:09:06,769.769 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'is', 'on', 'a', 'leash', 'outside', 'of', 'church', 'doors', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:09:06,784.784 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'door', 'building', 'window', 'brick', '[UNK]', 'floor', 'handle', 'sign', 'light', 'glass', 'paper', 'head', 'cat', 'frame', 'hand', 'number', 'tile', 'box', 'pipe', 'pole', 'letter', 'face', 'step', 'ceiling', 'tree', 'top', 'stone', 'flower', 'ear', 'leg', 'plant', 'panel', 'ground', 'word', 'reflection', 'room', 'shelf', 'white', 'front', 'bear', 'picture', 'large', 'clock', 'block', 'ledge', 'doorway', 'curtain', 'phone', 'shutter']
2022-03-16 17:09:22,667.667 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'church', 'building', 'door', 'ground', 'outside', 'wall', 'paper', 'window', 'sign', 'dog', 'pole', 'doorway', 'arch', 'collar', 'sidewalk', 'leash']
2022-03-16 17:11:46,492.492 2829:trainer.py:487 do_train_dict(): eta: 18:35:29 iter: 26600 speed: 294.5 images/sec total_norm: 138.2753 (141.2103) loss: 149.3650 (152.5968) masked_loss: 1.5621 (1.6054) tag_loss: 147.9222 (150.9914) time: 1.4344 (1.7388) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4290 (1.7332) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:11:46,854.854 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 17:11:46,855.855 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.50955200195312
2022-03-16 17:11:46,855.855 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.39589138245314
2022-03-16 17:12:00,728.728 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01919550448656082
2022-03-16 17:12:00,728.728 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:12:00,729.729 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'who', 'is', 'standing', 'on', 'the', '[MASK]', 'and', 'flying', 'a', 'kite', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:12:00,744.744 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', '[UNK]', 'leg', 'water', 'string', 'person', 'hair', 'woman', 'handle', 'man', 'hand', 'wave', 'beach', 'head', 'shirt', 'foot', 'ocean', 'rope', 'board', 'girl', 'short', 'suit', 'mountain', 'top', 'line', 'arm', 'grass', 'rock', 'jean', 'shadow', 'sand', 'tree', 'shoe', 'air', 'belt', 'sail', 'cloud', 'parachute', 'surfer', 'shore', 'horizon', 'wet', 'building', 'pole', 'distance', 'wake', 'flag', 'boot', 'boat']
2022-03-16 17:12:16,710.710 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'water', 'top', 'woman', 'short', 'rock', 'board', 'hair', 'person', 'foot', 'beach', 'sky', 'shirt', 'ocean', 'leg', 'wave', 'string', 'horizon', 'kite']
03-16 17:12:48.353 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 17:12:48.353 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 17:12:49.390 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}]
2022-03-16 17:14:40,445.445 2829:trainer.py:487 do_train_dict(): eta: 18:32:50 iter: 26700 speed: 294.3 images/sec total_norm: 139.1263 (141.1770) loss: 146.5380 (147.8299) masked_loss: 1.5377 (1.5350) tag_loss: 144.9714 (146.2949) time: 1.4334 (1.7395) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4283 (1.7342) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:14:40,807.807 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064
2022-03-16 17:14:40,808.808 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.53370666503906
2022-03-16 17:14:40,808.808 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4103467713541
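The interleaved `aml_server.py` records show a `monitor()` routine shelling out to `nvidia-smi` (about every 30 minutes here) and reporting one dict per GPU. The log does not show how those dicts are built, but `nvidia-smi`'s query mode emits exactly these three fields in machine-readable form; the sketch below is a plausible reconstruction, not the actual `aml_server.py` code (the query flags are real `nvidia-smi` options, the helper name is ours):

```python
import subprocess

def gpu_stats():
    """Return per-GPU memory and utilization, one dict per device."""
    out = subprocess.check_output([
        'nvidia-smi',
        '--query-gpu=memory.used,memory.total,utilization.gpu',
        '--format=csv,noheader,nounits',
    ], text=True)
    stats = []
    for row in out.strip().splitlines():
        used, total, util = (int(x) for x in row.split(', '))
        stats.append({'mem_used': used, 'mem_total': total, 'gpu_util': util})
    return stats
```

All eight V100s report roughly 29,000 MiB used of 32,510 MiB at 94-100 % utilization, which is consistent with the trainer's own `max mem: 26307` (presumably MiB of framework-allocated memory) once allocator caching and CUDA context overhead are added.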
2022-03-16 17:14:54,775.775 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019228482618927956
2022-03-16 17:14:54,775.775 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:14:54,776.776 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'sitting', '[MASK]', 'a', 'blue', 'toy', 'tractor', 'with', 'sheep', 'next', '[MASK]', 'him', '[MASK]', 'the', 'shadows', 'of', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:14:54,791.791 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shadow', 'ground', 'hair', 'boy', 'hand', 'head', 'wheel', 'ear', 'leg', 'sheep', '[UNK]', 'pig', 'baby', 'child', 'nose', 'tire', 'face', 'foot', 'arm', 'dirt', 'shirt', 'animal', 'mouth', 'shoe', 'toy', 'little', 'tractor', 'fence', 'vehicle', 'eye', 'young', 'small', 'back', 'post', 'truck', 'tail', 'pole', 'blue', 'handle', 'cart', 'jean', 'top', 'grass', 'rock', 'man', 'road', 'trunk', 'person', 'wool', 'wood']
2022-03-16 17:15:10,821.821 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'next', 'young', 'ground', 'hair', 'blue', 'child', 'arm', 'boy', 'shirt', 'animal', 'leg', 'vehicle', 'nose', 'ear', 'shadow', 'wheel', 'dirt', 'sheep', 'toy', 'straw', 'tractor']
2022-03-16 17:17:34,311.311 2829:trainer.py:487 do_train_dict(): eta: 18:30:10 iter: 26800 speed: 294.5 images/sec total_norm: 136.9151 (139.9857) loss: 147.9747 (147.2288) masked_loss: 1.5170 (1.5567) tag_loss: 146.1845 (145.6721) time: 1.4322 (1.7388) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.7336) save_time: 8.8805 (21.7526) lr: 0.000060 max mem: 26307
2022-03-16 17:17:34,673.673 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 17:17:34,673.673 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.01046752929688
2022-03-16 17:17:34,674.674 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.40696632640513
2022-03-16 17:17:48,717.717 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019247330725193024
2022-03-16 17:17:48,717.717 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:17:48,717.717 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'train', 'that', 'is', 'covered', '[MASK]', 'on', 'the', 'tracks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:17:48,733.733 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'track', 'wheel', 'tree', 'door', 'sky', 'window', 'car', 'roof', 'gravel', '[UNK]', 'ground', 'building', 'pole', 'stripe', 'line', 'grass', 'bumper', 'cover', 'light', 'fence', 'top', 'plastic', 'handle', 'container', 'red', 'pipe', 'box', 'wall', 'front', 'background', 'paint', 'sign', 'vent', 'platform', 'post', 'old', 'bush', 'house', 'windshield', 'ladder', 'wire', 'engine', 'tank', 'trailer', 'person', 'hose', 'writing', 'tire', 'yard']
2022-03-16 17:18:04,679.679 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'building', 'door', 'car', 'track', 'post', 'window', 'train', 'tree', 'sky', 'roof', 'wheel', 'pipe', 'lamp', 'gravel', 'container']
2022-03-16 17:20:28,287.287 2829:trainer.py:487 do_train_dict(): eta: 18:27:31 iter: 26900 speed: 294.3 images/sec total_norm: 138.2182 (142.0513) loss: 147.9337 (150.1810) masked_loss: 1.5444 (1.5876) tag_loss: 146.2841 (148.5934) time: 1.4331 (1.7397) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7345) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:20:28,648.648 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 17:20:28,648.648 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.73724365234375
2022-03-16 17:20:28,648.648 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.40635713647913
2022-03-16 17:20:42,578.578 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019264230504631996
2022-03-16 17:20:42,578.578 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:20:42,579.579 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'sign', 'at', 'the', 'corner', 'of', 'two', 'streets', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:20:42,594.594 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'building', 'pole', 'sign', 'wire', 'roof', 'door', 'street', '[UNK]', 'light', 'road', 'power', 'window', 'bush', 'line', 'stop', 'car', 'tire', 'chimney', 'letter', 'clock', 'wall', 'grass', 'sidewalk', 'garage', 'house', 'traffic', 'parking', 'post', 'fence', 'front', 'truck', 'telephone', 'shadow', 'number', 'arrow', 'flag', 'suv', 'fire', 'tower', 'man', 'leaf', 'lot', 'flower', 'lamp', 'church', 'wheel', 'curb', 'white']
2022-03-16 17:20:58,446.446 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'line', 'building', 'power', 'street', 'light', 'car', 'stop', 'mountain', 'tree', 'corner', 'letter', 'sign', 'sky', 'pole', 'telephone', 'chimney', 'graffiti']
2022-03-16 17:23:22,333.333 2829:trainer.py:487 do_train_dict(): eta: 18:24:51 iter: 27000 speed: 294.2 images/sec total_norm: 137.8419 (140.2704) loss: 145.6339 (148.3549) masked_loss: 1.5770 (1.5956) tag_loss: 143.7361 (146.7593) time: 1.4351 (1.7405) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4298 (1.7353) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:23:22,697.697 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 17:23:22,697.697 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.75640869140625
2022-03-16 17:23:22,697.697 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4195082970651
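`Tag mAP` is logged alongside tag precision at every report, but its exact definition is not visible in this log. For a multi-label tagger the conventional choice is mean average precision over tag classes, which would be computed roughly as below (an illustrative sketch using scikit-learn, not this repository's implementation):

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(y_true, y_score):
    """Mean AP over tag classes that have at least one positive label.

    y_true:  (n_samples, n_tags) binary ground-truth matrix
    y_score: (n_samples, n_tags) per-tag scores or probabilities
    """
    aps = [average_precision_score(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1]) if y_true[:, c].any()]
    return float(np.mean(aps))

# toy example: 4 samples, 3 tags, perfectly ranked scores
y_true = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 1]])
y_score = np.array([[0.9, 0.2, 0.1], [0.8, 0.7, 0.2],
                    [0.3, 0.6, 0.1], [0.1, 0.2, 0.9]])
print(tag_map(y_true, y_score))  # 1.0 for this toy data
```

Whatever the exact definition, the logged values sit around 0.019 and creep upward by only a few 1e-5 per hundred iterations, so this reads as an early-training diagnostic rather than a converged score.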
2022-03-16 17:23:36,794.794 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01930582895874977
2022-03-16 17:23:36,794.794 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:23:36,794.794 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'older', 'man', 'in', 'a', 'wind', '##breaker', 'standing', 'by', '[MASK]', 'street', 'while', '[MASK]', 'horse', 'walks', 'by', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:23:36,810.810 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'sky', 'tree', 'jacket', 'car', 'street', 'roof', 'window', 'shoe', 'road', 'building', 'short', '[UNK]', 'fence', 'head', 'hat', 'shirt', 'wagon', 'carriage', 'cart', 'horse', 'leg', 'wheel', 'person', 'sock', 'face', 'beard', 'pole', 'sunglasses', 'house', 'cap', 'hair', 'sidewalk', 'windshield', 'wall', 'sign', 'tire', 'harness', 'coat', 'neck', 'hand', 'paper', 'ground', 'glasses', 'nose', 'line', 'brick', 'railing', 'drawn', 'chimney']
2022-03-16 17:23:52,804.804 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'hand', 'building', 'road', 'street', 'short', 'car', 'hair', 'person', 'wall', 'standing', 'paper', 'window', 'tree', 'horse', 'sky', 'shirt', 'roof', 'gate', 'wheel', 'hat', 'cap', 'pole', 'jacket', 'walks', 'fence', 'carriage', 'wagon', 'beard', 'shoe', 'cart', 'tire', 'sunglasses', 'harness', 'windshield', 'sock']
2022-03-16 17:26:16,523.523 2829:trainer.py:487 do_train_dict(): eta: 18:22:12 iter: 27100 speed: 293.9 images/sec total_norm: 140.7691 (142.9792) loss: 147.4709 (147.8035) masked_loss: 1.4918 (1.5132) tag_loss: 145.9364 (146.2903) time: 1.4334 (1.7419) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.7367) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:26:16,883.883 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 17:26:16,884.884 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.34548950195312
2022-03-16 17:26:16,884.884 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.41437843266655
2022-03-16 17:26:31,045.045 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019328787922859192
2022-03-16 17:26:31,045.045 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:26:31,046.046 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'skier', 'kicks', '[MASK]', '[MASK]', 'while', 'riding', 'through', 'heavy', 'white', 'snow', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:26:31,061.061 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'shadow', '[UNK]', 'person', 'jacket', 'man', 'ground', 'mountain', 'sky', 'ski', 'pole', 'head', 'hill', 'coat', 'skier', 'arm', 'glove', 'cloud', 'hat', 'leg', 'slope', 'rock', 'track', 'tree', 'snowy', 'hand', 'helmet', 'foot', 'backpack', 'boot', 'board', 'face', 'sign', 'background', 'top', 'shirt', 'woman', 'building', 'fence', 'hair', 'side', 'cap', 'sun', 'sunglasses', 'group', 'lift', 'steep', 'line', 'wire', 'day']
2022-03-16 17:26:46,984.984 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'white', 'ground', 'board', 'person', 'hill', 'heavy', 'mountain', 'sky', 'leg', 'snow', 'shadow', 'cloud', 'pole', 'jacket', 'helmet', 'glove']
2022-03-16 17:29:10,576.576 2829:trainer.py:487 do_train_dict(): eta: 18:19:32 iter: 27200 speed: 294.2 images/sec total_norm: 142.0314 (143.9162) loss: 149.3119 (151.0374) masked_loss: 1.5927 (1.6276) tag_loss: 147.6635 (149.4098) time: 1.4338 (1.7405) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4285 (1.7354) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:29:10,938.938 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 17:29:10,938.938 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.21751403808594
2022-03-16 17:29:10,938.938 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4192025460603
2022-03-16 17:29:25,119.119 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019347792491316795
2022-03-16 17:29:25,119.119 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:29:25,119.119 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'men', 'in', 'front', 'of', '[MASK]', 'parking', '[MASK]', 'with', 'an', 'official', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:29:25,135.135 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['meter', 'car', 'sidewalk', 'tire', 'parking', 'street', 'pole', 'road', 'license', '[UNK]', 'shirt', 'plate', 'van', 'curb', 'tree', 'man', 'door', 'head', 'building', 'vehicle', 'light', 'window', 'hair', 'person', 'suv', 'hand', 'wheel', 'sign', 'truck', 'arm', 'line', 'leaf', 'tail', 'trunk', 'shoe', 'leg', 'windshield', 'short', 'bag', 'bumper', 'hat', 'mirror', 'woman', 'next', 'brick', 'handle', 'phone', 'sky', 'side', 'skirt']
2022-03-16 17:29:41,124.124 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'building', 'road', 'front', 'street', 'light', 'short', 'car', 'hair', 'wall', 'official', 'van', 'paper', 'tree', 'sign', 'shirt', 'traffic', 'vehicle', 'roof', 'plate', 'parking', 'hat', 'license', 'cap', 'pole', 'glasses', 'meter', 'fence', 'shoe', 'flip', 'sidewalk', 'drain', 'tire', 'curb', 'flop']
2022-03-16 17:32:04,830.830 2829:trainer.py:487 do_train_dict(): eta: 18:16:52 iter: 27300 speed: 293.8 images/sec total_norm: 138.5890 (141.6856) loss: 145.5451 (147.0515) masked_loss: 1.5990 (1.5987) tag_loss: 143.6497 (145.4527) time: 1.4339 (1.7425) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.7374) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:32:05,191.191 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 17:32:05,191.191 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.53842163085938
2022-03-16 17:32:05,192.192 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.41587903377784
2022-03-16 17:32:19,416.416 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019326167181134224
2022-03-16 17:32:19,416.416 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:32:19,417.417 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'desk', '[MASK]', '[MASK]', 'computer', 'and', 'a', 'tv', 'inside', 'of', 'a', 'room', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:32:19,432.432 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'leg', 'window', 'keyboard', 'screen', 'computer', 'floor', 'laptop', 'desk', 'table', 'mouse', 'monitor', 'speaker', 'room', 'pen', 'lamp', 'picture', '[UNK]', 'cup', 'pad', 'cord', 'chair', 'phone', 'stand', 'office', 'ball', 'outlet', 'top', 'box', 'television', 'base', 'shelf', 'pencil', 'door', 'light', 'toy', 'bowl', 'handle', 'remote', 'cell', 'ipod', 'camera', 'book', 'icon', 'mug', 'holder', 'poster', 'bottle', 'desktop', 'can']
2022-03-16 17:32:35,375.375 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'cup', 'inside', 'tv', 'floor', 'table', 'wall', 'stand', 'computer', 'window', 'ball', 'picture', 'screen', 'leg', 'desk', 'clock', 'shadow', 'speaker', 'ceiling', 'switch', 'pen', 'mouse', 'monitor', 'alarm', 'keyboard', 'lamp', 'laptop', 'drawer', 'dresser']
2022-03-16 17:34:59,132.132 2829:trainer.py:487 do_train_dict(): eta: 18:14:13 iter: 27400 speed: 293.7 images/sec total_norm: 137.9875 (139.2648) loss: 143.8305 (145.4532) masked_loss: 1.6015 (1.6724) tag_loss: 142.6362 (143.7808) time: 1.4342 (1.7431) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4289 (1.7379) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:34:59,493.493 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282
2022-03-16 17:34:59,493.493 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.86441802978516
2022-03-16 17:34:59,493.493 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43792420820756
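`caption acc` follows each stats line and, given the `[MASK]` positions in the sampled captions, is plausibly the fraction of masked tokens the model restores correctly: values such as 0.6060606241226196 (= 20/33) and 0.6176470518112183 (= 21/34) are ratios of small integers, which fits a per-batch count of masked positions. A sketch of that metric in PyTorch (hypothetical, matching the log only in spirit):

```python
import torch

def masked_token_accuracy(logits, target, mask):
    """Accuracy over masked positions only.

    logits: (N, T, V) prediction scores over the vocabulary
    target: (N, T) ground-truth token ids
    mask:   (N, T) bool, True where the input token was corrupted/[MASK]
    """
    pred = logits.argmax(dim=-1)
    correct = (pred == target) & mask
    return correct.sum().float() / mask.sum().clamp(min=1).float()
```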
2022-03-16 17:35:13,577.577 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019366972148418427
2022-03-16 17:35:13,577.577 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:35:13,578.578 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'his', 'owner', 'in', 'the', 'kitchen', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:35:13,593.593 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'floor', 'eye', 'head', 'ear', 'cabinet', 'leg', 'person', 'refrigerator', 'rug', 'door', 'foot', 'nose', '[UNK]', 'paw', 'collar', 'drawer', 'man', 'neck', 'sock', 'kitchen', 'handle', 'wall', 'shoe', 'face', 'food', 'tail', 'mat', 'hand', 'cord', 'hair', 'stove', 'knob', 'white', 'brown', 'carpet', 'bottle', 'someone', 'cat', 'next', 'jean', 'wire', 'shirt', 'oven', 'paper', 'bag', 'something', 'open', 'top', 'small']
2022-03-16 17:35:29,468.468 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'door', 'person', 'floor', 'wall', 'eye', 'neck', 'foot', 'kitchen', 'dog', 'owner', 'leg', 'nose', 'ear', 'handle', 'cabinet', 'collar', 'shoe', 'drawer', 'refrigerator', 'rug', 'paw']
2022-03-16 17:37:53,499.499 2829:trainer.py:487 do_train_dict(): eta: 18:11:33 iter: 27500 speed: 293.6 images/sec total_norm: 138.7480 (142.2419) loss: 148.2450 (148.0545) masked_loss: 1.6379 (1.6286) tag_loss: 146.5880 (146.4260) time: 1.4346 (1.7436) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4294 (1.7385) save_time: 8.8805 (21.7526) lr: 0.000059 max mem: 26307
2022-03-16 17:37:53,859.859 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-16 17:37:53,859.859 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.71047973632812
2022-03-16 17:37:53,859.859 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.4278347319451
2022-03-16 17:38:08,100.100 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019369781017303467
2022-03-16 17:38:08,100.100 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:38:08,101.101 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'standing', 'on', 'top', 'of', 'a', 'large', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:38:08,116.116 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'cloud', 'rock', 'head', 'sheep', 'face', 'leg', 'grass', 'moss', 'boulder', 'hill', 'wool', 'mountain', 'tree', 'cliff', 'top', 'goat', 'fur', 'rocky', 'ear', '[UNK]', 'blue', 'body', 'bush', 'ground', 'standing', 'animal', 'horn', 'stone', 'side', 'nose', 'large', 'branch', 'next', 'plant', 'hillside', 'ram', 'couple', 'cloudy', 'foot', 'mouth', 'white', 'day', 'area', 'green', 'wall', 'tail', 'group', 'black', 'steep']
2022-03-16 17:38:24,038.038 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'large', 'top', 'rock', 'hill', 'mountain', 'sky', 'leg', 'grass', 'bush', 'cloud', 'sheep', 'moss', 'wool', 'boulder']
2022-03-16 17:40:47,754.754 2829:trainer.py:487 do_train_dict(): eta: 18:08:53 iter: 27600 speed: 293.8 images/sec total_norm: 140.4390 (149.8653) loss: 147.6465 (148.3757) masked_loss: 1.6717 (1.6804) tag_loss: 145.7764 (146.6953) time: 1.4335 (1.7425) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4281 (1.7373) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:40:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365
2022-03-16 17:40:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.34246826171875
2022-03-16 17:40:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43072164101721
2022-03-16 17:41:02,389.389 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01937682181596756
2022-03-16 17:41:02,389.389 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:41:02,389.389 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'the', 'window', 'you', 'can', 'see', 'a', 'reflection', 'of', '##木', 'sea', 'and', 'beautiful', 'buildings', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:41:02,405.405 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'building', 'window', 'sky', 'boat', '[UNK]', 'floor', 'person', 'sign', 'balcony', 'pole', 'roof', 'glass', 'reflection', 'door', 'railing', 'flag', 'wall', 'tree', 'post', 'umbrella', 'table', 'clock', 'chair', 'river', 'frame', 'bridge', 'canopy', 'arch', 'leg', 'tower', 'dock', 'man', 'head', 'sidewalk', 'dome', 'hand', 'shirt', 'top', 'deck', 'front', 'light', 'bench', 'lamp', 'doorway', 'woman', 'ground', 'handle', 'large', 'letter']
2022-03-16 17:41:18,375.375 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'building', 'sea', 'window', 'beautiful', 'letter', 'sign', 'bus', 'boat', 'mirror', 'dome', 'reflection', 'balcony', 'umbrella', 'stripe']
03-16 17:42:49.489 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 17:42:49.489 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 17:42:50.801 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 17:43:42,162.162 2829:trainer.py:487 do_train_dict(): eta: 18:06:13 iter: 27700 speed: 293.6 images/sec total_norm: 137.5406 (139.4670) loss: 147.0752 (146.6626) masked_loss: 1.5796 (1.6098) tag_loss: 145.6542 (145.0528) time: 1.4337 (1.7441) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4284 (1.7385) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:43:42,522.522 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184
2022-03-16 17:43:42,523.523 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.0507354736328
2022-03-16 17:43:42,523.523 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.44000179647541
2022-03-16 17:43:56,943.943 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019360285252332687
2022-03-16 17:43:56,944.944 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:43:56,944.944 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'laptop', 'computer', 'sitting', 'on', '[MASK]', 'of', 'a', '[MASK]', 'desk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:43:56,959.959 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['computer', 'desk', 'monitor', 'keyboard', 'mouse', 'screen', 'table', 'laptop', 'apple', 'stand', 'wall', 'cord', '[UNK]', 'logo', 'base', 'speaker', 'pad', 'shelf', 'picture', 'phone', 'lamp', 'wire', 'tree', 'box', 'paper', 'book', 'light', 'pen', 'floor', 'desktop', 'plug', 'printer', 'icon', 'television', 'cell', 'office', 'window', 'handle', 'top', 'key', 'frame', 'chair', 'cup', 'glass', 'sign', 'front', 'bottle', 'curtain', 'room', 'tray']
2022-03-16 17:44:12,917.917 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'top', 'light', 'office', 'table', 'wall', 'stand', 'computer', 'screen', 'desk', 'speaker', 'mouse', 'monitor', 'logo', 'keyboard', 'lamp', 'cord', 'laptop']
2022-03-16 17:46:36,579.579 2829:trainer.py:487 do_train_dict(): eta: 18:03:33 iter: 27800 speed: 293.6 images/sec total_norm: 140.5508 (144.2280) loss: 146.3595 (147.4450) masked_loss: 1.6444 (1.6544) tag_loss: 144.7416 (145.7906) time: 1.4335 (1.7442) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.7390) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:46:36,940.940 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625
2022-03-16 17:46:36,940.940 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.9900665283203
2022-03-16 17:46:36,940.940 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43810798617675
2022-03-16 17:46:51,473.473 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01936018280684948
2022-03-16 17:46:51,473.473 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:46:51,474.474 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'yellow', 'plane', 'prepares', 'to', '[MASK]', 'off', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:46:51,489.489 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'airplane', 'wheel', 'wing', 'grass', 'cloud', 'tail', 'cockpit', 'propeller', 'tree', 'ground', 'field', 'gear', 'shadow', 'runway', 'pilot', 'star', 'nose', 'yellow', 'landing', 'man', 'person', 'number', 'window', 'logo', 'bush', 'letter', 'small', 'plane', 'engine', '[UNK]', 'tire', 'building', 'front', 'blade', 'aircraft', 'blue', 'road', 'hedge', 'circle', 'fighter', 'cross', 'windshield', 'top', 'stripe', 'shirt', 'fence', 'white', 'jet', 'flag']
2022-03-16 17:47:07,386.386 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'road', 'field', 'ground', 'star', 'cross', 'window', 'wing', 'tree', 'letter', 'sky', 'yellow', 'pilot', 'nose', 'tiny', 'plane', 'wheel', 'grass', 'tail', 'bush', 'cloud', 'runway', 'airplane', 'cockpit', 'propeller', 'hedge']
2022-03-16 17:49:31,069.069 2829:trainer.py:487 do_train_dict(): eta: 18:00:53 iter: 27900 speed: 293.4 images/sec total_norm: 138.8524 (141.7900) loss: 149.2848 (149.3548) masked_loss: 1.5838 (1.5781) tag_loss: 147.7480 (147.7767) time: 1.4349 (1.7449) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.7397) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:49:31,432.432 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 17:49:31,433.433 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.7484130859375
2022-03-16 17:49:31,433.433 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43593192781721
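The `eta` field is just the averaged iteration time extrapolated over the remaining steps, which makes the planned run length recoverable from any single line: at iter 26000, eta 18:51:25 is 67,885 s, and 67,885 / 1.7372 s/iter leaves about 39,077 iterations, i.e. a total of roughly 65,000; repeating the arithmetic at iter 28500 (63,893 / 1.7473 ≈ 36,567 remaining) lands on the same total, so the schedule is internally consistent. A two-line check:

```python
def remaining_iters(eta_hms, avg_iter_time):
    """Convert an 'H:MM:SS' eta into an iteration count at the given s/iter."""
    h, m, s = (int(x) for x in eta_hms.split(':'))
    return round((h * 3600 + m * 60 + s) / avg_iter_time)

print(26000 + remaining_iters('18:51:25', 1.7372))  # 65077
print(28500 + remaining_iters('17:44:53', 1.7473))  # 65067
```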
2022-03-16 17:49:46,010.010 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019337672740221024
2022-03-16 17:49:46,010.010 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:49:46,010.010 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'brown', 'and', 'white', 'cow', 'drinking', 'from', 'black', 'sp', '##out', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:49:46,025.025 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mouth', 'nose', 'eye', 'head', 'cow', 'ear', 'face', 'grass', 'horn', 'neck', '[UNK]', 'collar', 'spot', 'plant', 'fence', 'ground', 'wall', 'tree', 'leg', 'weed', 'dirt', 'water', 'chin', 'tag', 'tongue', 'white', 'hair', 'rope', 'bush', 'field', 'rock', 'harness', 'buckle', 'snout', 'bell', 'flower', 'leaf', 'brown', 'hay', 'tail', 'patch', 'number', 'next', 'fur', 'pole', 'black', 'lip', 'close', 'chain', 'metal']
2022-03-16 17:50:01,940.940 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'number', 'face', 'black', 'white', 'ground', 'mouth', 'eye', 'neck', 'spot', 'tongue', 'nose', 'ear', 'shadow', 'lip', 'grass', 'drinking', 'tag', 'horn', 'collar', 'cow', 'weed', 'hose', 'buckle']
2022-03-16 17:52:25,466.466 2829:trainer.py:487 do_train_dict(): eta: 17:58:13 iter: 28000 speed: 293.6 images/sec total_norm: 141.1687 (143.0833) loss: 146.3469 (146.2297) masked_loss: 1.5239 (1.5871) tag_loss: 144.7503 (144.6426) time: 1.4330 (1.7440) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4278 (1.7387) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:52:25,829.829 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-16 17:52:25,829.829 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.07962036132812
2022-03-16 17:52:25,829.829 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43938268206722
2022-03-16 17:52:40,292.292 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019387340173125267
2022-03-16 17:52:40,293.293 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:52:40,293.293 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'eating', 'grass', 'behind', 'a', 'fence', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:52:40,309.309 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fence', 'building', 'wall', 'zebra', 'leg', 'post', 'tree', 'door', 'bush', 'trunk', 'ground', 'pole', 'leaf', 'plant', 'zoo', 'window', 'head', '[UNK]', 'enclosure', 'branch', 'dirt', 'tail', 'mane', 'stripe', 'ear', 'brick', 'wire', 'grass', 'flower', 'log', 'next', 'mouth', 'rock', 'pillar', 'front', 'neck', 'enclosed', 'garden', 'pen', 'gate', 'area', 'hair', 'blade', 'light', 'house', 'sign', 'nose', 'wood', 'box', 'standing']
2022-03-16 17:52:56,245.245 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'door', 'ground', 'post', 'wall', 'plant', 'window', 'tree', 'leg', 'brick', 'grass', 'blade', 'pole', 'dirt', 'leaf', 'trunk', 'fence', 'zoo', 'zebra']
2022-03-16 17:55:20,109.109 2829:trainer.py:487 do_train_dict(): eta: 17:55:33 iter: 28100 speed: 293.2 images/sec total_norm: 138.6513 (143.0182) loss: 146.6214 (146.7393) masked_loss: 1.5493 (1.5936) tag_loss: 144.5963 (145.1458) time: 1.4345 (1.7464) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4291 (1.7412) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:55:20,470.470 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-16 17:55:20,471.471 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.68038177490234
2022-03-16 17:55:20,471.471 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.43796056382199
2022-03-16 17:55:35,057.057 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019437750801444054
2022-03-16 17:55:35,058.058 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:55:35,058.058 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', '[MASK]', 'with', 'luggage', 'at', 'an', 'airport', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:55:35,073.073 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'sign', 'man', 'floor', 'person', 'woman', 'short', 'airport', 'cart', 'ceiling', '[UNK]', 'wall', 'bag', 'hat', 'light', 'number', 'wheel', 'television', 'luggage', 'backpack', 'arrow', 'suitcase', 'building', 'chair', 'head', 'lady', 'shoe', 'flop', 'hand', 'cap', 'group', 'hair', 'foot', 'skirt', 'clock', 'jean', 'letter', 'arm', 'screen', 'line', 'glasses', 'boy', 'wheelchair', 'leg', 'girl', 'door', 'pillar', 'jacket', 'window', 'pole']
2022-03-16 17:55:51,044.044 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'number', 'line', 'light', 'woman', 'short', 'television', 'hair', 'person', 'floor', 'wall', 'date', 'airport', 'lady', 'chair', 'box', 'sign', 'shirt', 'bag', 'wheel', 'ceiling', 'column', 'hat', 'cap', 'arrow', 'shoe', 'cart', 'backpack', 'pillar', 'luggage']
2022-03-16 17:58:14,627.627 2829:trainer.py:487 do_train_dict(): eta: 17:52:53 iter: 28200 speed: 293.4 images/sec total_norm: 141.2034 (143.5171) loss: 143.7106 (147.1741) masked_loss: 1.5716 (1.6050) tag_loss: 142.2991 (145.5690) time: 1.4339 (1.7452) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7400) save_time: 8.8805 (21.7526) lr: 0.000058 max mem: 26307
2022-03-16 17:58:14,987.987 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 17:58:14,987.987 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.5584716796875
2022-03-16 17:58:14,988.988 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.44672195650243
2022-03-16 17:58:29,673.673 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019422173500061035
2022-03-16 17:58:29,673.673 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 17:58:29,674.674 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'boat', 'is', 'in', 'the', 'lake', ',', 'one', 'is', 'red', 'and', '[MASK]', 'and', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 17:58:29,689.689 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'boat', 'tree', 'building', 'roof', 'window', '[UNK]', 'wall', 'house', 'windshield', 'grass', 'moss', 'rock', 'shore', 'cabin', 'river', 'hill', 'small', 'chimney', 'shed', 'sky', 'car', 'ground', 'door', 'sign', 'forest', 'light', 'top', 'person', 'bank', 'reflection', 'ball', 'fence', 'puddle', 'mud', 'bush', 'number', 'rope', 'dock', 'branch', 'plant', 'white', 'old', 'front', 'wave', 'blue', 'wood', 'tire', 'next', 'body']
2022-03-16 17:58:45,656.656 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'water', 'building', 'white', 'door', 'red', 'car', 'blue', 'post', 'lake', 'wall', 'hill', 'window', 'tree', 'boat', 'roof', 'grass', 'cone', 'chimney', 'windshield', 'puddle']
2022-03-16 18:01:09,314.314 2829:trainer.py:487 do_train_dict(): eta: 17:50:13 iter: 28300 speed: 293.1 images/sec total_norm: 142.0648 (145.7118) loss: 147.4682 (149.7926) masked_loss: 1.5994 (1.6333) tag_loss: 145.7122 (148.1592) time: 1.4338 (1.7469) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4288 (1.7419) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307
2022-03-16 18:01:09,678.678 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-16 18:01:09,678.678 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.0526580810547
2022-03-16 18:01:09,678.678 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 70.45548502156433
= 70.45548502156433 2022-03-16 18:01:24,317.317 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01942465826869011 2022-03-16 18:01:24,318.318 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:01:24,318.318 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'cows', 'walking', 'down', 'areas', 'road', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:01:24,333.333 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['truck', 'cow', 'pole', 'ground', 'rock', 'windshield', 'mountain', '[UNK]', 'tire', 'grass', 'goat', 'road', 'post', 'cab', 'wheel', 'dirt', 'animal', 'flag', 'man', 'wall', 'door', 'leg', 'sign', 'cover', 'front', 'trailer', 'herd', 'fence', 'mirror', 'tail', 'shirt', 'sheep', 'license', 'bull', 'block', 'person', 'hill', 'light', 'window', 'tree', 'van', 'horse', 'group', 'sky', 'banner', 'calf', 'cattle', 'bumper', 'number', 'head'] 2022-03-16 18:01:40,237.237 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'road', 'ground', 'rock', 'post', 'star', 'cover', 'mountain', 'letter', 'animal', 'truck', 'flag', 'grass', 'bush', 'pole', 'dirt', 'logo', 'sheep', 'fence', 'cab', 'cow', 'tire', 'cone', 'goat', 'herd', 'tractor', 'windshield'] 2022-03-16 18:04:04,388.388 2829:trainer.py:487 do_train_dict(): eta: 17:47:34 iter: 28400 speed: 292.5 images/sec total_norm: 141.2142 (142.4258) loss: 149.6705 (148.8117) masked_loss: 1.5869 (1.6020) tag_loss: 148.1572 (147.2096) time: 1.4349 (1.7507) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4299 (1.7455) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:04:04,749.749 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7297297120094299 2022-03-16 18:04:04,749.749 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.73825073242188 2022-03-16 18:04:04,750.750 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.45856715419836 2022-03-16 18:04:19,544.544 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019467096775770187 2022-03-16 18:04:19,544.544 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:04:19,544.544 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'carrot', '##s', 'hang', 'tied', 'together', 'on', 'a', 'pole', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:04:19,559.559 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leaf', 'carrot', '[UNK]', 'vegetable', 'stem', 'bunch', 'top', 'table', 'pile', 'other', 'ground', 'plant', 'large', 'fresh', 'orange', 'wall', 'cloth', 'person', 'plastic', 'full', 'hand', 'ring', 'close', 'light', 'basket', 'different', 'banana', 'paper', 'next', 'flower', 'head', 'background', 'various', 'white', 'group', 'pepper', 'green', 'tag', 'garden', 'sign', 'tree', 'bag', 'red', 'dirt', 'band', 'small', 'market', 'sweet', 'many', 'potato'] 2022-03-16 18:04:35,412.412 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'table', 'pole', 'leaf', 'bunch', 'vegetable', 'carrot'] 2022-03-16 18:06:59,115.115 2829:trainer.py:487 do_train_dict(): eta: 17:44:53 iter: 28500 speed: 293.0 images/sec total_norm: 138.6740 (140.2635) loss: 146.8426 (146.2962) masked_loss: 1.5447 (1.5367) tag_loss: 145.6845 (144.7594) time: 1.4339 (1.7473) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4286 (1.7421) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:06:59,476.476 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 18:06:59,476.476 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.02423095703125 2022-03-16 18:06:59,477.477 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.45932896487362 2022-03-16 18:07:14,200.200 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019464680925011635 2022-03-16 18:07:14,201.201 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:07:14,201.201 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'motorcycles', 'parked', 'in', 'rows', 'on', 'display', 'in', 'a', '##⁄₄', 'lot', '##zi', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:07:14,216.216 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'man', 'tire', 'shirt', 'bike', 'person', 'seat', 'jean', '[UNK]', 'wheel', 'street', 'tank', 'helmet', 'short', 'light', 'engine', 'woman', 'ground', 'line', 'road', 'building', 'fender', 'gas', 'tree', 'sign', 'shoe', 'pavement', 'logo', 'hat', 'lot', 'parking', 'car', 'bag', 'windshield', 'hair', 'mirror', 'pipe', 'red', 'pole', 'sidewalk', 'sky', 'dress', 'window', 'next', 'sunglasses', 'row', 'other', 'parked', 'exhaust', 'crowd'] 2022-03-16 18:07:30,111.111 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'building', 'road', 'street', 'short', 'hair', 'person', 'seat', 'lot', 'arm', 'tree', 'sign', 'jean', 'shirt', 'gas', 'display', 'tank', 'wheel', 'parking', 'bike', 'pipe', 'motorcycle', 'helmet', 'tire', 'pavement', 'fender', 'windshield'] 2022-03-16 18:09:53,963.963 2829:trainer.py:487 do_train_dict(): eta: 17:42:13 iter: 28600 speed: 292.8 images/sec total_norm: 137.4933 (139.8967) loss: 145.6241 (147.6192) masked_loss: 1.5913 (1.6291) tag_loss: 144.3860 (145.9901) time: 1.4332 (1.7485) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7433) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:09:54,324.324 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 18:09:54,325.325 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.09474182128906 2022-03-16 18:09:54,325.325 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
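The "Input ids sample" rows are BERT-tokenized captions corrupted for masked-language modeling and padded with '[PAD]' to a fixed length of 70. Stray off-caption tokens such as '##⁄₄' and '##zi' in the motorcycle caption above (or 'areas' in the cows caption earlier) are exactly what standard BERT corruption produces, where a selected position is sometimes replaced by a random vocabulary token instead of '[MASK]'. A sketch under the standard 15% selection and 80/10/10 replacement assumption; this pipeline's actual ratios are not shown, and ids 101/102/103 are the usual bert-base-uncased [CLS]/[SEP]/[MASK]:

import random

def mask_caption(ids, mask_id, vocab_size, special, p=0.15):
    """BERT-style corruption (assumed ratios): for ~p of non-special positions,
    80% -> [MASK], 10% -> random token, 10% -> unchanged.
    Returns (inputs, labels); labels use -100 so the loss ignores them."""
    inputs, labels = list(ids), [-100] * len(ids)
    for i, t in enumerate(ids):
        if t in special or random.random() >= p:
            continue
        labels[i] = t
        r = random.random()
        if r < 0.8:
            inputs[i] = mask_id
        elif r < 0.9:
            inputs[i] = random.randrange(vocab_size)  # source of off-caption tokens
        # else: keep the original token
    return inputs, labels

ids = [101, 1037, 4049, 1997, 102]  # toy "[CLS] a boat of [SEP]" ids
# In real use, special should also include [PAD] (id 0).
inputs, labels = mask_caption(ids, mask_id=103, vocab_size=30522,
                              special={101, 102})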
= 70.4683858226816 2022-03-16 18:10:09,157.157 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01946013793349266 2022-03-16 18:10:09,158.158 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:10:09,158.158 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'that', '[MASK]', 'looking', 'at', '[MASK]', 'television', 'screen', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:10:09,174.174 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'wall', 'ear', 'television', 'head', 'screen', 'table', 'logo', 'tail', 'paw', 'curtain', 'bowl', 'stand', 'paper', 'stripe', 'leg', 'light', 'window', 'box', '[UNK]', 'cord', 'lid', 'door', 'button', 'book', 'eye', 'container', 'room', 'reflection', 'nose', 'cover', 'shelf', 'device', 'top', 'toilet', 'tv', 'monitor', 'glass', 'seat', 'black', 'dvd', 'remote', 'cd', 'computer', 'basket', 'base', 'bottle', 'wire', 'tag', 'front'] 2022-03-16 18:10:25,155.155 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'book', 'television', 'table', 'wall', 'writing', 'box', 'screen', 'card', 'leg', 'ear', 'bowl', 'cat', 'wire', 'logo', 'cloth', 'curtain', 'cord', 'lid', 'paw'] 2022-03-16 18:12:48,639.639 2829:trainer.py:487 do_train_dict(): eta: 17:39:33 iter: 28700 speed: 293.1 images/sec total_norm: 142.9664 (144.5648) loss: 148.8179 (149.9991) masked_loss: 1.6053 (1.6532) tag_loss: 147.2126 (148.3459) time: 1.4318 (1.7467) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4265 (1.7415) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:12:49,001.001 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4545454680919647 2022-03-16 18:12:49,001.001 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 173.98463439941406 2022-03-16 18:12:49,001.001 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
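The "caption acc" readings have small-integer denominators (0.4545... = 5/11 in the record above, 0.6285... = 22/35 earlier), which is consistent with accuracy being measured only over the supervised masked positions of one logging batch rather than over all tokens; the pipeline's exact definition is not shown. A sketch of that per-masked-token accuracy, with illustrative names and shapes:

import torch

def masked_token_accuracy(logits, labels, ignore_index=-100):
    """Accuracy over supervised (masked) positions only; labels carry
    ignore_index everywhere else, as in the masking sketch above."""
    preds = logits.argmax(dim=-1)
    keep = labels != ignore_index
    correct = (preds[keep] == labels[keep]).sum()
    return (correct.float() / keep.sum().clamp(min=1)).item()

logits = torch.randn(2, 70, 30522)   # (batch, seq_len, vocab) dummy scores
labels = torch.full((2, 70), -100)   # only two positions are supervised
labels[0, 2], labels[1, 4] = 17, 99
print(masked_token_accuracy(logits, labels))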
= 70.45268759462569 03-16 18:12:50.866 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 18:12:50.866 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 18:12:51.558 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-16 18:13:03,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019492145627737045 2022-03-16 18:13:03,809.809 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:13:03,809.809 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', '[MASK]', 'types', 'of', 'tools', 'are', 'arranged', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:13:03,824.824 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'handle', 'table', 'blade', 'cloth', '[UNK]', 'pair', 'net', 'wire', 'mat', 'tape', 'string', 'hole', 'floor', 'green', 'ground', 'cord', 'top', 'line', 'pen', 'band', 'other', 'screw', 'paper', 'cap', 'ball', 'blanket', 'number', 'surface', 'wall', 'next', 'plastic', 'bowl', 'ribbon', 'blue', 'circle', 'eye', 'spot', 'towel', 'logo', 'board', 'red', 'man', 'tool', 'different', 'shadow', 'bolt', 'letter', 'bunch', 'bag'] 2022-03-16 18:13:19,711.711 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'band', 'top', 'different', 'table', 'paper', 'metal', 'label', 'bottom', 'spot', 'handle', 'string', 'bottle', 'plastic', 'cap', 'cloth', 'mat', 'banana', 'scissors'] 2022-03-16 18:15:43,709.709 2829:trainer.py:487 do_train_dict(): eta: 17:36:53 iter: 28800 speed: 292.5 images/sec total_norm: 138.5986 (144.3402) loss: 147.9702 (148.6646) masked_loss: 1.6064 (1.6054) tag_loss: 146.7345 (147.0592) time: 1.4345 (1.7507) data: 0.0001 (0.0005) to_device: 0.0051 (0.0049) time_gpu: 1.4292 (1.7453) save_time: 8.8805 (21.7526) lr: 0.000057 max mem: 26307 2022-03-16 18:15:44,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-16 18:15:44,071.071 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.14830017089844 2022-03-16 18:15:44,071.071 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
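Alongside training, aml_server.py polls the GPUs roughly every half hour and logs one dict per device; utilization happens to read 0% in this snapshot and 100% in the 18:42 one, which mostly reflects where the poll lands relative to synchronization and logging pauses. monitor()'s implementation is not in the log, but the same dicts can be produced with nvidia-smi's query interface (a real CLI feature; the function name below is illustrative):

import subprocess

def gpu_snapshot():
    """One {'mem_used', 'mem_total', 'gpu_util'} dict per GPU, matching the
    monitor() output format above (field names copied from the log)."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total,utilization.gpu",
        "--format=csv,noheader,nounits",
    ], text=True)
    gpus = []
    for row in out.strip().splitlines():
        used, total, util = (int(x) for x in row.split(", "))
        gpus.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return gpus

# Requires nvidia-smi on PATH; prints e.g. [{'mem_used': 29000, ...}, ...]
print(gpu_snapshot())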
= 70.44963449920337 2022-03-16 18:15:58,981.981 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019483480602502823 2022-03-16 18:15:58,981.981 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:15:58,981.981 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bowl', 'of', 'sh', '##ree', '##ded', '[MASK]', '##s', 'in', 'milk', 'in', 'and', 'a', 'half', 'eaten', '[MASK]', '##nut', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:15:58,996.996 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'table', 'hole', 'plate', 'spoon', 'bowl', 'cereal', 'chocolate', 'handle', 'reflection', 'food', 'cream', 'nut', 'napkin', 'fork', 'cup', 'glass', 'milk', 'paper', 'white', 'ice', 'dessert', 'desert', 'dish', 'cookie', 'light', 'next', 'chip', 'rim', 'container', 'top', 'half', 'eaten', 'coffee', 'line', 'knife', 'bacon', 'liquid', 'box', 'small', 'sugar', 'bottom', 'blue', 'slice', 'hand', 'drink', 'mug', 'almond', 'couple', 'leg'] 2022-03-16 18:16:14,907.907 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'half', 'table', 'food', 'paper', 'wood', 'bowl', 'hole', 'handle', 'plate', 'meat', 'milk', 'pen', 'chocolate', 'eaten', 'spoon'] 2022-03-16 18:18:38,571.571 2829:trainer.py:487 do_train_dict(): eta: 17:34:12 iter: 28900 speed: 292.8 images/sec total_norm: 139.8740 (142.7727) loss: 146.2504 (147.4974) masked_loss: 1.5786 (1.6387) tag_loss: 144.3844 (145.8588) time: 1.4325 (1.7487) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4274 (1.7434) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:18:38,932.932 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 18:18:38,933.933 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.38246154785156 2022-03-16 18:18:38,933.933 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.4415922888394 2022-03-16 18:18:53,793.793 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019552214071154594 2022-03-16 18:18:53,793.793 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:18:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tray', 'filled', 'with', 'po', '[MASK]', '##gra', '[MASK]', '##s', 'and', 'other', 'cut', 'fruits', 'to', 'eat', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:18:53,809.809 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fruit', 'apple', 'stem', 'table', '[UNK]', 'bunch', 'pile', 'red', 'spot', 'pear', 'plate', 'bowl', 'banana', 'carrot', 'potato', 'top', 'full', 'vegetable', 'strawberry', 'flower', 'other', 'orange', 'onion', 'next', 'group', 'many', 'writing', 'board', 'paper', 'box', 'tray', 'wall', 'different', 'cardboard', 'variety', 'berry', 'various', 'grape', 'container', 'close', 'hole', 'end', 'bag', 'plastic', 'ripe', 'sign', 'white', 'skin', 'fresh', 'large'] 2022-03-16 18:19:09,632.632 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'cut', 'orange', 'bowl', 'fruit', 'plastic', 'apple', 'stem', 'container', 'tray', 'banana'] 2022-03-16 18:21:33,653.653 2829:trainer.py:487 do_train_dict(): eta: 17:31:32 iter: 29000 speed: 292.4 images/sec total_norm: 138.8702 (141.7669) loss: 148.8365 (150.4626) masked_loss: 1.5171 (1.5620) tag_loss: 147.5899 (148.9006) time: 1.4336 (1.7507) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4282 (1.7455) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:21:34,012.012 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 18:21:34,013.013 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.46090698242188 2022-03-16 18:21:34,013.013 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.4574932216369 2022-03-16 18:21:49,079.079 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019599059596657753 2022-03-16 18:21:49,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:21:49,080.080 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'on', 'a', 'motorcycle', 'holding', '[MASK]', 'dog', 'while', 'looking', 'down', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:21:49,095.095 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'tree', 'man', 'jacket', 'dog', 'head', 'bush', 'hand', 'grass', 'stripe', 'face', 'shirt', 'background', 'motorcycle', '[UNK]', 'road', 'mirror', 'sky', 'seat', 'house', 'vest', 'glasses', 'bike', 'bag', 'ear', 'arm', 'building', 'window', 'back', 'pole', 'fence', 'car', 'coat', 'wheel', 'tire', 'shoulder', 'person', 'mouth', 'nose', 'leg', 'shoe', 'hood', 'trunk', 'collar', 'ground', 'jean', 'sidewalk', 'strap', 'chair', 'harness'] 2022-03-16 18:22:05,030.030 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'road', 'hair', 'arm', 'tree', 'shirt', 'dog', 'background', 'nose', 'mirror', 'grass', 'bush', 'jacket', 'fence', 'motorcycle', 'vest', 'stripe'] 2022-03-16 18:24:28,534.534 2829:trainer.py:487 do_train_dict(): eta: 17:28:51 iter: 29100 speed: 292.8 images/sec total_norm: 142.3582 (144.3920) loss: 146.1716 (149.5215) masked_loss: 1.5811 (1.6037) tag_loss: 144.0772 (147.9178) time: 1.4338 (1.7489) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4289 (1.7439) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:24:28,897.897 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 18:24:28,897.897 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.3625030517578 2022-03-16 18:24:28,898.898 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
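Each logging step prints the 50 highest-scoring tags ("Sample Generation") next to the ground-truth set ("GT Tags"), and the running "Tag Precision" figure hovers around 70.5 across this span. A plausible reading, not confirmed by the log, is precision@k: the fraction of the top-k predicted tags that appear in the GT set, accumulated across images. A per-image sketch with k matching the 50 sampled tags (all names are illustrative):

def topk_tag_precision(scores, gt_tags, k=50):
    """Precision@k for one image: the share of the k highest-scoring tags
    that belong to the ground-truth set, as a percentage."""
    topk = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for t in topk if t in gt_tags)
    return 100.0 * hits / k

scores = {"man": 0.9, "shirt": 0.8, "dog": 0.7, "cat": 0.1}
print(topk_tag_precision(scores, {"man", "dog"}, k=3))  # 66.67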
= 70.46819223116522 2022-03-16 18:24:44,114.114 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01960493065416813 2022-03-16 18:24:44,114.114 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:24:44,115.115 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', '[MASK]', 'on', 'the', 'snow', 'covered', 'mountain', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:24:44,130.130 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', '[UNK]', 'jacket', 'snow', 'sky', 'person', 'helmet', 'ground', 'head', 'man', 'glove', 'ski', 'pole', 'coat', 'skier', 'hand', 'face', 'foot', 'stripe', 'rock', 'hill', 'arm', 'hood', 'snowy', 'slope', 'leg', 'hat', 'pine', 'boot', 'leaf', 'mountain', 'board', 'top', 'steep', 'logo', 'plant', 'side', 'downhill', 'bush', 'stick', 'jump', 'air', 'design', 'branch', 'backpack', 'day', 'suit', 'track', 'poles', 'forest'] 2022-03-16 18:25:00,133.133 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'ground', 'person', 'mountain', 'plant', 'foot', 'tree', 'sky', 'clothes', 'snow', 'coat', 'pole', 'jacket', 'pine', 'logo', 'ski', 'boot', 'helmet', 'glove', 'stripe', 'skier'] 2022-03-16 18:27:23,560.560 2829:trainer.py:487 do_train_dict(): eta: 17:26:11 iter: 29200 speed: 292.5 images/sec total_norm: 141.7175 (143.5716) loss: 149.6364 (148.0433) masked_loss: 1.5976 (1.5875) tag_loss: 148.1038 (146.4559) time: 1.4328 (1.7503) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7451) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:27:23,921.921 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 18:27:23,921.921 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.51433563232422 2022-03-16 18:27:23,922.922 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.48557238529973 2022-03-16 18:27:38,955.955 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019604241475462914 2022-03-16 18:27:38,955.955 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:27:38,955.955 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'motorcycle', 'parked', 'in', 'a', 'city', 'street', 'with', 'houses', '[MASK]', 'the', 'background', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:27:38,971.971 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'tire', 'building', 'window', 'bike', 'tree', 'seat', 'sign', 'wheel', 'roof', 'car', 'mirror', '[UNK]', 'windshield', 'ground', 'road', 'street', 'sky', 'engine', 'pipe', 'light', 'house', 'logo', 'fender', 'brick', 'pole', 'exhaust', 'gas', 'leaf', 'chimney', 'handle', 'door', 'parked', 'black', 'spoke', 'rim', 'side', 'grass', 'sidewalk', 'front', 'wall', 'next', 'box', 'city', 'tank', 'lot', 'parking', 'license', 'plate', 'bush'] 2022-03-16 18:27:54,802.802 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'house', 'building', 'door', 'road', 'street', 'light', 'car', 'ground', 'seat', 'engine', 'window', 'tree', 'box', 'sign', 'background', 'roof', 'wheel', 'mirror', 'grass', 'leaf', 'garage', 'globe', 'bike', 'pipe', 'motorcycle', 'rim', 'tire', 'suv', 'windshield'] 2022-03-16 18:30:18,591.591 2829:trainer.py:487 do_train_dict(): eta: 17:23:30 iter: 29300 speed: 292.5 images/sec total_norm: 140.8023 (143.9691) loss: 146.9014 (148.8958) masked_loss: 1.5923 (1.5947) tag_loss: 145.8006 (147.3011) time: 1.4321 (1.7502) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7450) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:30:18,951.951 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-16 18:30:18,951.951 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.95120239257812 2022-03-16 18:30:18,951.951 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.48794665952929 2022-03-16 18:30:34,009.009 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019610542804002762 2022-03-16 18:30:34,010.010 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:30:34,010.010 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'delicious', 'looking', 'plate', 'of', 'fresh', 'bro', '##cco', '[MASK]', 'and', 'some', 'type', 'of', 'meat', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:30:34,026.026 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'table', 'chicken', 'food', '[UNK]', 'meat', 'bread', 'fork', 'carrot', 'white', 'potato', 'handle', 'mushroom', 'crust', 'glass', 'knife', 'napkin', 'shadow', 'tomato', 'cup', 'fish', 'sauce', 'onion', 'stem', 'vegetable', 'bun', 'design', 'pizza', 'slice', 'bowl', 'cheese', 'sandwich', 'reflection', 'top', 'leaf', 'pepper', 'spoon', 'piece', 'logo', 'base', 'spot', 'ham', 'meal', 'sausage', 'light', 'bean', 'salad', 'wall', 'shrimp', 'bottle'] 2022-03-16 18:30:49,883.883 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'cup', 'table', 'type', 'food', 'fresh', 'handle', 'plate', 'meat', 'stem', 'chicken', 'fork'] 2022-03-16 18:33:13,729.729 2829:trainer.py:487 do_train_dict(): eta: 17:20:50 iter: 29400 speed: 292.3 images/sec total_norm: 142.0972 (144.0886) loss: 148.7812 (150.4367) masked_loss: 1.6246 (1.5846) tag_loss: 147.6292 (148.8521) time: 1.4328 (1.7514) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4277 (1.7462) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:33:14,089.089 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 18:33:14,090.090 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.72116088867188 2022-03-16 18:33:14,090.090 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
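"Tag mAP" creeps from about 0.0194 to 0.0196 across this span. Values that small are unsurprising for a mean over a large multi-label tag vocabulary early in training, since most classes have few positives in any evaluation window; the "# of tokens = 577" printed next to it is plausibly the size of the candidate tag set, though the log does not say. A standard per-class average-precision sketch (skipping classes with no positives is an assumed convention):

import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(scores, targets):
    """Mean AP across tag classes; scores and targets are (n_images, n_tags).
    Classes with no positive image are skipped."""
    aps = [average_precision_score(targets[:, c], scores[:, c])
           for c in range(targets.shape[1]) if targets[:, c].any()]
    return float(np.mean(aps)) if aps else 0.0

rng = np.random.default_rng(0)
scores = rng.random((8, 577))                       # 577 matches "# of tokens"
targets = (rng.random((8, 577)) < 0.05).astype(int)
print(tag_map(scores, targets))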
= 70.50300521850586 2022-03-16 18:33:29,295.295 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019644897431135178 2022-03-16 18:33:29,295.295 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:33:29,295.295 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'watches', 'as', 'a', 'bald', 'eagle', 'flies', 'by', 'her', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:33:29,311.311 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'hair', 'face', 'bird', 'tree', 'woman', 'mouth', 'wing', 'eagle', 'eye', 'feather', 'tail', 'beak', 'shirt', 'girl', 'nose', 'bush', 'foot', 'arm', 'neck', 'person', 'air', 'hand', 'ear', 'leg', '[UNK]', 'top', 'man', 'white', 'glove', 'jacket', 'black', 'teeth', 'sky', 'background', 'name', 'image', 'plant', 'strap', 'dress', 'watch', 'young', 'wrist', 'chest', 'shoulder', 'couple', 'jean', 'beautiful', 'large', 'necklace'] 2022-03-16 18:33:45,204.204 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'woman', 'hair', 'mouth', 'post', 'eye', 'wing', 'tree', 'shirt', 'leg', 'nose', 'bird', 'tail', 'eagle', 'fence', 'bald', 'feather', 'beak'] 2022-03-16 18:36:08,769.769 2829:trainer.py:487 do_train_dict(): eta: 17:18:09 iter: 29500 speed: 292.5 images/sec total_norm: 139.5855 (142.6077) loss: 146.1849 (146.5057) masked_loss: 1.6106 (1.5761) tag_loss: 144.9258 (144.9296) time: 1.4323 (1.7504) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4269 (1.7452) save_time: 8.8805 (21.7526) lr: 0.000056 max mem: 26307 2022-03-16 18:36:09,129.129 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 18:36:09,129.129 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.2039031982422 2022-03-16 18:36:09,129.129 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.50467484706157 2022-03-16 18:36:24,480.480 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01969200186431408 2022-03-16 18:36:24,480.480 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:36:24,481.481 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'older', 'model', 'delivery', 'truck', '[MASK]', 'in', 'front', '[MASK]', 'some', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:36:24,496.496 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tire', '[UNK]', 'truck', 'grill', 'windshield', 'wheel', 'ground', 'bumper', 'mirror', 'window', 'tree', 'dirt', 'light', 'door', 'road', 'grass', 'front', 'sky', 'steering', 'bed', 'building', 'license', 'rim', 'hood', 'logo', 'roof', 'plate', 'pole', 'trailer', 'fence', 'car', 'person', 'number', 'mud', 'old', 'wall', 'next', 'white', 'large', 'sign', 'lot', 'emblem', 'bus', 'fender', 'man', 'side', 'vehicle', 'step', 'head', 'field'] 2022-03-16 18:36:40,518.518 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'road', 'front', 'light', 'car', 'ground', 'model', 'cover', 'window', 'step', 'tree', 'sky', 'picture', 'truck', 'plate', 'wheel', 'mirror', 'grass', 'license', 'delivery', 'logo', 'steering', 'rim', 'tire', 'emblem', 'grill', 'windshield', 'bumper'] 2022-03-16 18:39:04,228.228 2829:trainer.py:487 do_train_dict(): eta: 17:15:28 iter: 29600 speed: 291.8 images/sec total_norm: 139.1917 (142.5527) loss: 148.6858 (148.5452) masked_loss: 1.4876 (1.5223) tag_loss: 147.4291 (147.0229) time: 1.4338 (1.7546) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7494) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:39:04,588.588 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 18:39:04,589.589 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.98666381835938 2022-03-16 18:39:04,589.589 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.51333668256046 2022-03-16 18:39:19,833.833 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019648294895887375 2022-03-16 18:39:19,833.833 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:39:19,833.833 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'setting', 'with', 'white', '[MASK]', '##lian', '##ce', 'and', '[MASK]', 'counters', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:39:19,849.849 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['refrigerator', 'wall', 'kitchen', '[UNK]', 'floor', 'door', 'cabinet', 'window', 'handle', 'ceiling', 'box', 'blind', 'microwave', 'light', 'drawer', 'sink', 'towel', 'stove', 'cord', 'tile', 'bag', 'outlet', 'shelf', 'oven', 'white', 'vent', 'magnet', 'basket', 'bottle', 'room', 'table', 'chair', 'mirror', 'paper', 'switch', 'top', 'fridge', 'can', 'knob', 'pot', 'bowl', 'trash', 'counter', 'fan', 'rack', 'cup', 'hood', 'maker', 'reflection', 'picture'] 2022-03-16 18:39:35,745.745 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'white', 'door', 'light', 'floor', 'wall', 'chair', 'window', 'box', 'wood', 'kitchen', 'handle', 'cabinet', 'ceiling', 'blind', 'sink', 'pot', 'towel', 'trash', 'lid', 'outlet', 'mat', 'tile', 'stove', 'refrigerator', 'microwave', 'vent', 'rug'] 2022-03-16 18:41:59,396.396 2829:trainer.py:487 do_train_dict(): eta: 17:12:48 iter: 29700 speed: 292.3 images/sec total_norm: 140.2098 (144.2407) loss: 148.6682 (147.7390) masked_loss: 1.5420 (1.5807) tag_loss: 146.2836 (146.1583) time: 1.4330 (1.7517) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7465) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:41:59,757.757 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.39393940567970276 2022-03-16 18:41:59,757.757 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.75442504882812 2022-03-16 18:41:59,758.758 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
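The eta field shrinks by roughly 2m40s per 100 iterations, which matches remaining-iterations times the smoothed step time. max_iter is never printed in this excerpt; backing it out from the iter-29700 record above (eta 17:12:48 at 1.7517 s/iter) gives about 65k, and the value below is chosen so the round-trip reproduces that record exactly, so treat it as hypothetical:

import datetime

def eta_seconds(iter_now, max_iter, avg_iter_time):
    """Remaining wall-clock estimate, printed by the logger as hh:mm:ss."""
    return (max_iter - iter_now) * avg_iter_time

# max_iter backed out from the iter-29700 record; not shown in the log.
print(datetime.timedelta(seconds=round(eta_seconds(29_700, 65_076, 1.7517))))
# -> 17:12:48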
= 70.52547696772838 2022-03-16 18:42:15,191.191 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019615011289715767 2022-03-16 18:42:15,191.191 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:42:15,191.191 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'qu', '##aint', 'kitchen', 'with', '[MASK]', 'flowers', 'on', 'a', 'small', 'central', 'table', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:42:15,207.207 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'kitchen', 'table', '[UNK]', 'refrigerator', 'chair', 'floor', 'cabinet', 'door', 'microwave', 'handle', 'rug', 'plate', 'ceiling', 'curtain', 'light', 'bowl', 'flower', 'bottle', 'oven', 'drawer', 'leg', 'window', 'sink', 'towel', 'room', 'shelf', 'stove', 'pot', 'vase', 'cloth', 'cup', 'plant', 'picture', 'dish', 'magnet', 'maker', 'design', 'coffee', 'cushion', 'clock', 'tray', 'blanket', 'mirror', 'stool', 'fruit', 'glass', 'top', 'dining', 'knob'] 2022-03-16 18:42:31,113.113 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'door', 'central', 'cup', 'board', 'floor', 'table', 'wall', 'chair', 'kitchen', 'bowl', 'handle', 'pink', 'clock', 'plate', 'cabinet', 'cutting', 'flower', 'sink', 'cloth', 'toy', 'towel', 'tile', 'rack', 'jar', 'stove', 'dresser', 'magnet', 'knob', 'oven', 'refrigerator', 'microwave', 'vase', 'rug'] 03-16 18:42:51.561 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 18:42:51.561 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 18:42:52.914 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 18:44:54,805.805 2829:trainer.py:487 do_train_dict(): eta: 17:10:07 iter: 29800 speed: 291.9 images/sec total_norm: 139.4920 (143.6648) loss: 144.2754 (146.0038) masked_loss: 1.6383 (1.6250) tag_loss: 142.6230 (144.3788) time: 1.4329 (1.7540) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.7489) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:44:55,166.166 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 18:44:55,170.170 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 107.9897689819336 2022-03-16 18:44:55,171.171 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.54795055325613 2022-03-16 18:45:10,660.660 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019662857055664062 2022-03-16 18:45:10,661.661 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:45:10,661.661 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'commons', 'of', 'a', '[MASK]', 'in', 'front', 'of', 'a', 'refrigerator', 'with', 'its', 'door', 'open', 'taxonomy', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:45:10,677.677 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bottle', 'hand', 'boy', 'shirt', 'eye', 'hair', 'refrigerator', 'shelf', 'finger', 'nose', 'label', '[UNK]', 'apple', 'beer', 'door', 'wall', 'bag', 'face', 'head', 'person', 'wine', 'can', 'cap', 'glass', 'mouth', 'rack', 'ear', 'drawer', 'young', 'floor', 'cooler', 'lid', 'handle', 'container', 'fridge', 'bin', 'jar', 'ball', 'top', 'child', 'button', 'open', 'arm', 'sleeve', 'egg', 'front', 'thumb', 'kid', 'man', 'drink'] 2022-03-16 18:45:26,611.611 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'open', 'door', 'front', 'close', 'hair', 'mouth', 'person', 'wall', 'boy', 'eye', 'shirt', 'label', 'finger', 'nose', 'wine', 'bag', 'beer', 'bottle', 'apple', 'shelf', 'lid', 'drawer', 'jar', 'refrigerator'] 2022-03-16 18:47:49,944.944 2829:trainer.py:487 do_train_dict(): eta: 17:07:26 iter: 29900 speed: 292.3 images/sec total_norm: 141.0909 (142.3764) loss: 146.9476 (147.9617) masked_loss: 1.5776 (1.6118) tag_loss: 144.7945 (146.3500) time: 1.4311 (1.7514) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4258 (1.7458) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:47:50,304.304 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-16 18:47:50,304.304 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.07025146484375 2022-03-16 18:47:50,305.305 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.55842314402263 2022-03-16 18:48:05,803.803 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019699733704328537 2022-03-16 18:48:05,804.804 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:48:05,804.804 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'red', 'fire', 'hydra', '##nt', 'leaking', 'water', 'all', 'over', 'a', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:48:05,819.819 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'chain', 'fire', 'water', 'leaf', 'puddle', 'paint', '[UNK]', 'reflection', 'base', 'red', 'mud', 'grass', 'hole', 'top', 'wall', 'pole', 'moss', 'rock', 'branch', 'stick', 'shadow', 'cap', 'curb', 'light', 'tree', 'yellow', 'dirt', 'dirty', 'pond', 'bolt', 'trash', 'flower', 'next', 'pipe', 'rusty', 'sign', 'old', 'object', 'paper', 'trunk', 'metal', 'open', 'drain', 'building', 'road', 'area', 'post', 'graffiti', 'leg'] 2022-03-16 18:48:21,674.674 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'street', 'red', 'fire', 'ground', 'chain', 'grass', 'leaf', 'reflection', 'gravel', 'drain', 'hose', 'puddle'] 2022-03-16 18:50:45,483.483 2829:trainer.py:487 do_train_dict(): eta: 17:04:45 iter: 30000 speed: 291.7 images/sec total_norm: 138.8714 (141.9345) loss: 146.1889 (147.5258) masked_loss: 1.6131 (1.5767) tag_loss: 144.6801 (145.9491) time: 1.4339 (1.7554) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4285 (1.7503) save_time: 8.8805 (21.7526) lr: 0.000055 max mem: 26307 2022-03-16 18:50:45,485.485 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0030000.pt 2022-03-16 18:50:54,698.698 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-16 18:50:54,698.698 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.35382080078125 2022-03-16 18:50:54,698.698 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.56384276076409 2022-03-16 18:51:10,417.417 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019727135077118874 2022-03-16 18:51:10,417.417 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:51:10,418.418 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'photo', '[MASK]', 'a', 'wooden', 'bench', 'underneath', 'a', 'tree', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:51:10,433.433 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'bench', 'trunk', 'ground', 'park', 'road', 'grass', 'bus', 'leg', 'pole', 'car', 'sidewalk', 'person', 'street', 'building', '[UNK]', 'dirt', 'back', 'window', 'shadow', 'truck', 'woman', 'branch', 'light', 'man', 'tire', 'graffiti', 'next', 'lot', 'wall', 'wheel', 'sign', 'van', 'front', 'yellow', 'hat', 'couple', 'curb', 'large', 'side', 'dress', 'pavement', 'post', 'city', 'fence', 'line', 'top', 'flower', 'parking'] 2022-03-16 18:51:26,346.346 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'road', 'park', 'street', 'car', 'ground', 'post', 'tree', 'sky', 'bus', 'leg', 'truck', 'wooden', 'shadow', 'grass', 'photo', 'pole', 'bench', 'dirt', 'trunk'] 2022-03-16 18:53:49,144.144 2829:trainer.py:487 do_train_dict(): eta: 17:02:14 iter: 30100 speed: 278.8 images/sec total_norm: 142.2289 (143.9444) loss: 145.6458 (147.4455) masked_loss: 1.6090 (1.6459) tag_loss: 144.0788 (145.7996) time: 1.4321 (1.8366) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.7431) save_time: 8.8421 (19.6009) lr: 0.000055 max mem: 26307 2022-03-16 18:53:49,509.509 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 18:53:49,509.509 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.583740234375 2022-03-16 18:53:49,509.509 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.57026211315433 2022-03-16 18:54:05,157.157 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01978553831577301 2022-03-16 18:54:05,157.157 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:54:05,157.157 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'hum', '##mus', '[MASK]', 'pit', '##a', 'chips', 'on', 'a', 'plate', 'with', 'fa', '##la', '##fe', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:54:05,173.173 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'food', 'plate', 'egg', '[UNK]', 'meat', 'paper', 'sausage', 'bread', 'tomato', 'napkin', 'shadow', 'bowl', 'potato', 'sauce', 'cookie', 'cup', 'shirt', 'container', 'bag', 'floor', 'cheese', 'water', 'breakfast', 'vegetable', 'rice', 'butter', 'white', 'different', 'bottle', 'chicken', 'person', 'glass', 'phone', 'other', 'fork', 'beef', 'bun', 'mushroom', 'full', 'yellow', 'green', 'spoon', 'knife', 'dinner', 'spot', 'meal', 'side', 'chip', 'wall'] 2022-03-16 18:54:21,133.133 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'food', 'paper', 'plate', 'meat', 'bread', 'egg', 'sausage'] 2022-03-16 18:56:44,723.723 2829:trainer.py:487 do_train_dict(): eta: 16:59:33 iter: 30200 speed: 291.6 images/sec total_norm: 141.1012 (143.4288) loss: 146.5415 (147.2401) masked_loss: 1.5724 (1.5607) tag_loss: 144.9308 (145.6794) time: 1.4325 (1.7557) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.7505) save_time: 8.8421 (19.6009) lr: 0.000055 max mem: 26307 2022-03-16 18:56:45,085.085 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 18:56:45,086.086 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.15689086914062 2022-03-16 18:56:45,086.086 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.56280526390957 2022-03-16 18:57:00,669.669 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019794346764683723 2022-03-16 18:57:00,669.669 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:57:00,670.670 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'on', 'ski', '##s', 'stands', 'in', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:57:00,685.685 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jacket', '[UNK]', 'snow', 'ski', 'tree', 'pole', 'glove', 'ground', 'person', 'boot', 'sky', 'coat', 'head', 'helmet', 'skier', 'hand', 'hat', 'man', 'mountain', 'hill', 'woman', 'child', 'foot', 'boy', 'slope', 'snowy', 'girl', 'track', 'cloud', 'leg', 'arm', 'lift', 'face', 'building', 'background', 'backpack', 'branch', 'top', 'bush', 'poles', 'trunk', 'young', 'sign', 'skiing', 'hair', 'stripe', 'wire', 'hood', 'small', 'roof'] 2022-03-16 18:57:16,711.711 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'car', 'ground', 'person', 'arm', 'chair', 'tree', 'sign', 'sky', 'snow', 'lift', 'pole', 'jacket', 'ski', 'boot', 'helmet', 'glove', 'skier'] 2022-03-16 18:59:40,410.410 2829:trainer.py:487 do_train_dict(): eta: 16:56:53 iter: 30300 speed: 291.4 images/sec total_norm: 140.2482 (141.6203) loss: 145.6944 (147.3203) masked_loss: 1.6413 (1.6330) tag_loss: 144.1128 (145.6873) time: 1.4321 (1.7569) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7517) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 18:59:40,771.771 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 18:59:40,771.771 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.28025817871094 2022-03-16 18:59:40,771.771 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.57024020897715 2022-03-16 18:59:56,489.489 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019815821200609207 2022-03-16 18:59:56,490.490 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 18:59:56,490.490 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'bunch', 'of', 'people', 'sits', 'valkyrie', 'a', 'table', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 18:59:56,505.505 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'wall', 'shirt', 'bottle', 'plate', 'man', 'person', 'woman', 'hair', 'pizza', 'picture', 'fork', 'head', '[UNK]', 'knife', 'glasses', 'group', 'watch', 'frame', 'girl', 'straw', 'face', 'mirror', 'water', 'restaurant', 'glass', 'food', 'hand', 'sunglasses', 'label', 'lid', 'napkin', 'camera', 'beer', 'can', 'cup', 'spoon', 'hat', 'chair', 'pitcher', 'top', 'cap', 'phone', 'juice', 'menu', 'box', 'light', 'salt', 'sign', 'necklace'] 2022-03-16 19:00:12,397.397 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'water', 'woman', 'hair', 'girl', 'person', 'table', 'wall', 'shirt', 'picture', 'drink', 'frame', 'plate', 'bottle', 'cap', 'glasses', 'bunch', 'fork', 'pizza', 'juice', 'lid', 'menu', 'sunglasses'] 2022-03-16 19:02:35,895.895 2829:trainer.py:487 do_train_dict(): eta: 16:54:11 iter: 30400 speed: 291.8 images/sec total_norm: 141.1617 (145.2466) loss: 148.5276 (149.5912) masked_loss: 1.5957 (1.5880) tag_loss: 146.8653 (148.0031) time: 1.4327 (1.7549) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4274 (1.7497) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:02:36,257.257 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 19:02:36,257.257 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.93768310546875 2022-03-16 19:02:36,258.258 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
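lr drifts down from 0.000058 to 0.000054 over these ~2,200 iterations. With the base lr of 1e-4 from the run name, a plain linear decay to zero over an assumed ~65k total iterations lands within one rounding digit of the printed values, but not exactly on them, so the real schedule likely adds warmup or uses a slightly different horizon. A shape-only sketch, wired the way torch.optim.lr_scheduler.LambdaLR expects:

import torch

base_lr, max_iter = 1e-4, 65_000   # 1e-4 from the run name; max_iter assumed

def decay(it):
    return 1.0 - it / max_iter     # linear decay to zero (assumed shape)

p = torch.nn.Parameter(torch.zeros(1))
opt = torch.optim.SGD([p], lr=base_lr)
sched = torch.optim.lr_scheduler.LambdaLR(opt, decay)  # step once per iteration

for it in (28_200, 30_000, 30_400):
    print(it, f"{base_lr * decay(it):.6f}")
# 28200 0.000057 / 30000 0.000054 / 30400 0.000053, versus 0.000058 /
# 0.000055 / 0.000054 in the log: close, but treat this as a sketch only.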
= 70.5809973919978 2022-03-16 19:02:51,894.894 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01982649601995945 2022-03-16 19:02:51,895.895 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:02:51,895.895 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'red', 'yellow', '[MASK]', 'blue', 'airplane', 'is', 'sitting', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:02:51,911.911 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'nose', 'airplane', 'building', 'ground', 'wheel', 'cockpit', 'door', 'engine', 'wing', 'tree', '[UNK]', 'background', 'sky', 'stripe', 'tail', 'windshield', 'line', 'front', 'shadow', 'bridge', 'airport', 'runway', 'tire', 'sign', 'walkway', 'vehicle', 'pole', 'blue', 'fence', 'city', 'large', 'landing', 'truck', 'cover', 'plane', 'bush', 'man', 'logo', 'cone', 'grass', 'railing', 'flag', 'light', 'orange', 'display', 'fuselage', 'cart', 'box', 'white'] 2022-03-16 19:03:07,809.809 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['line', 'building', 'door', 'front', 'red', 'ground', 'blue', 'bridge', 'cover', 'engine', 'window', 'wing', 'tree', 'box', 'sky', 'yellow', 'background', 'nose', 'truck', 'wheel', 'tail', 'trailer', 'runway', 'cart', 'tire', 'airplane', 'cockpit', 'luggage', 'stripe', 'windshield'] 2022-03-16 19:05:31,627.627 2829:trainer.py:487 do_train_dict(): eta: 16:51:30 iter: 30500 speed: 291.4 images/sec total_norm: 140.5835 (143.5303) loss: 147.1093 (147.3752) masked_loss: 1.4450 (1.4994) tag_loss: 145.9678 (145.8757) time: 1.4332 (1.7573) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4278 (1.7521) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:05:31,988.988 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 19:05:31,988.988 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.09210205078125 2022-03-16 19:05:31,989.989 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.58494514265871 2022-03-16 19:05:47,645.645 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019816165789961815 2022-03-16 19:05:47,645.645 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:05:47,646.646 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'half', 'of', 'a', 'sandwich', '[MASK]', 'held', 'in', 'hand', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:05:47,661.661 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sandwich', 'meat', 'bread', 'hand', 'thumb', 'person', 'paper', 'finger', '[UNK]', 'table', 'half', 'chicken', 'bun', 'food', 'crust', 'shadow', 'bottom', 'onion', 'dog', 'background', 'wall', 'cheese', 'napkin', 'man', 'piece', 'nail', 'eaten', 'palm', 'cup', 'close', 'top', 'hot', 'large', 'sub', 'arm', 'basket', 'line', 'hamburger', 'cut', 'white', 'roll', 'shirt', 'wrist', 'sleeve', 'pepper', 'handle', 'chair', 'ground', 'tomato', 'beef'] 2022-03-16 19:06:03,492.492 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'half', 'person', 'food', 'paper', 'bottom', 'finger', 'shadow', 'meat', 'thumb', 'bread', 'sandwich'] 2022-03-16 19:08:27,302.302 2829:trainer.py:487 do_train_dict(): eta: 16:48:49 iter: 30600 speed: 291.4 images/sec total_norm: 139.9355 (143.5255) loss: 144.2353 (145.9365) masked_loss: 1.5490 (1.5329) tag_loss: 142.9350 (144.4036) time: 1.4322 (1.7568) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7516) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:08:27,663.663 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 19:08:27,663.663 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.99993896484375 2022-03-16 19:08:27,663.663 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.5893248144889 2022-03-16 19:08:43,351.351 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019861236214637756 2022-03-16 19:08:43,351.351 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:08:43,352.352 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'photographs', '[MASK]', 'comparing', 'the', 'city', 'streets', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:08:43,367.367 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'cloud', 'pole', 'road', 'building', 'street', 'car', 'window', 'person', 'sign', 'light', 'sidewalk', '[UNK]', 'tire', 'line', 'roof', 'house', 'ground', 'bench', 'wall', 'picture', 'wheel', 'bus', 'clock', 'photo', 'truck', 'fence', 'woman', 'door', 'letter', 'bush', 'man', 'statue', 'church', 'tower', 'traffic', 'shirt', 'post', 'white', 'van', 'monument', 'hat', 'front', 'grass', 'flag', 'cross', 'mountain', 'snow', 'shadow'] 2022-03-16 19:08:59,277.277 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'city', 'house', 'church', 'building', 'road', 'street', 'light', 'car', 'person', 'wall', 'window', 'tree', 'tower', 'sky', 'truck', 'clock', 'mirror', 'cloud', 'statue', 'pole', 'tire'] 2022-03-16 19:11:23,330.330 2829:trainer.py:487 do_train_dict(): eta: 16:46:08 iter: 30700 speed: 290.9 images/sec total_norm: 140.1153 (141.1037) loss: 148.4659 (148.7408) masked_loss: 1.5450 (1.5845) tag_loss: 147.0765 (147.1564) time: 1.4331 (1.7603) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4279 (1.7551) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:11:23,691.691 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-16 19:11:23,691.691 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.290283203125 2022-03-16 19:11:23,691.691 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
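`caption acc` is plausibly the fraction of masked caption tokens the model restores correctly in the current batch, which would explain its jumpiness (0.51, 0.62, 0.64 in consecutive reports over ~35 masked tokens). A hedged PyTorch sketch; the function name, tensor shapes, and the ignore_index convention are assumptions:

```python
import torch

def masked_caption_accuracy(logits, target, ignore_index=-100):
    """Fraction of masked positions whose argmax prediction matches the
    original token. ignore_index marks positions that were not masked."""
    pred = logits.argmax(dim=-1)
    valid = target.ne(ignore_index)          # only score masked positions
    correct = (pred.eq(target) & valid).sum()
    return correct.float() / valid.sum().clamp(min=1)

# Tiny check: 2 masked positions, one right and one wrong -> 0.5
logits = torch.zeros(1, 4, 10)
logits[0, 1, 3] = 1.0   # predicts token 3 at position 1 (correct)
logits[0, 3, 7] = 1.0   # predicts token 7 at position 3 (wrong, target 5)
target = torch.tensor([[-100, 3, -100, 5]])
print(masked_caption_accuracy(logits, target))  # tensor(0.5000)
```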
= 70.5810612393664 2022-03-16 19:11:39,484.484 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019849155098199844 2022-03-16 19:11:39,485.485 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:11:39,485.485 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'lean', 'back', 'style', 'motorcycle', 'with', 'saddle', '##bags', 'upstairs', 'outside', 'a', 'building', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:11:39,501.501 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'tire', 'building', 'bike', 'car', 'sky', 'wheel', 'light', 'window', '[UNK]', 'street', 'road', 'line', 'mirror', 'seat', 'tree', 'door', 'sign', 'pole', 'suv', 'handle', 'lot', 'wall', 'pipe', 'engine', 'parking', 'sidewalk', 'van', 'curb', 'ground', 'roof', 'fender', 'cloud', 'gas', 'windshield', 'fence', 'truck', 'tank', 'man', 'next', 'flag', 'license', 'rim', 'plate', 'person', 'shirt', 'helmet', 'shadow', 'house', 'parked'] 2022-03-16 19:11:55,374.374 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'road', 'street', 'light', 'car', 'style', 'seat', 'lot', 'van', 'chair', 'window', 'tree', 'sign', 'sky', 'roof', 'bag', 'handle', 'wheel', 'mirror', 'parking', 'bike', 'pipe', 'motorcycle', 'tire', 'exhaust', 'suv', 'fender'] 03-16 19:12:53.013 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 19:12:53.013 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 19:12:54.226 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 19:14:19,149.149 2829:trainer.py:487 do_train_dict(): eta: 16:43:27 iter: 30800 speed: 291.2 images/sec total_norm: 138.5450 (140.4120) loss: 146.8929 (145.0909) masked_loss: 1.5597 (1.5646) tag_loss: 145.1104 (143.5263) time: 1.4335 (1.7582) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4282 (1.7531) save_time: 8.8421 (19.6009) lr: 0.000054 max mem: 26307 2022-03-16 19:14:19,510.510 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-16 19:14:19,510.510 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.0284423828125 2022-03-16 19:14:19,510.510 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
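The interleaved `aml_server.py monitor()` entries, emitted roughly every 30 minutes, report one dict per GPU with `mem_used`, `mem_total`, and `gpu_util`. One way to produce exactly that structure is nvidia-smi's CSV query interface; this is a sketch, not necessarily how aml_server.py itself parses the output:

```python
import subprocess

def gpu_monitor():
    """Return one dict per GPU, matching the monitor() lines in this log.
    Uses nvidia-smi's CSV query mode; requires the NVIDIA driver."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total,utilization.gpu",
        "--format=csv,noheader,nounits",
    ]).decode()
    stats = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats

# On the 8x V100-32GB node in this log, the result looks like:
# [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, ...]
```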
= 70.60162789304665 2022-03-16 19:14:35,509.509 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01989004947245121 2022-03-16 19:14:35,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:14:35,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'nun', 'sharing', '[MASK]', 'with', 'two', 'young', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:14:35,525.525 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'wall', 'hand', 'head', 'man', 'picture', 'face', 'nose', 'ear', 'person', 'woman', 'table', 'window', 'railing', '[UNK]', 'glass', 'boy', 'eye', 'mouth', 'room', 'arm', 'finger', 'phone', 'frame', 'glasses', 'ceiling', 'cup', 'food', 'chair', 'collar', 'light', 'plate', 'watch', 'girl', 'cell', 'neck', 'bottle', 'necklace', 'jacket', 'wrist', 'laptop', 'bowl', 'fork', 'cake', 'dress', 'tie', 'bracelet', 'logo', 'ring'] 2022-03-16 19:14:51,484.484 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'young', 'light', 'woman', 'cup', 'television', 'hair', 'person', 'table', 'wall', 'boy', 'bar', 'sign', 'shirt', 'picture', 'nose', 'ear', 'handle', 'plate', 'knife', 'blade', 'pan', 'glasses', 'pizza', 'tray', 'slice', 'railing', 'nun', 'crust'] 2022-03-16 19:17:15,137.137 2829:trainer.py:487 do_train_dict(): eta: 16:40:46 iter: 30900 speed: 290.9 images/sec total_norm: 142.5558 (145.7625) loss: 150.8870 (148.2350) masked_loss: 1.5747 (1.6199) tag_loss: 149.3488 (146.6151) time: 1.4326 (1.7598) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.7546) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:17:15,497.497 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-16 19:17:15,498.498 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.89096069335938 2022-03-16 19:17:15,498.498 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.6092076086229 2022-03-16 19:17:31,368.368 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01991117373108864 2022-03-16 19:17:31,368.368 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:17:31,369.369 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'empty', 'kitchen', 'with', 'white', 'cabinets', '[MASK]', 'black', 'counters', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:17:31,384.384 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cabinet', 'kitchen', 'stove', 'wall', '[UNK]', 'floor', 'oven', 'drawer', 'door', 'top', 'knob', 'handle', 'table', 'outlet', 'window', 'ceiling', 'pan', 'sink', 'tile', 'light', 'hood', 'wood', 'pot', 'refrigerator', 'black', 'board', 'counter', 'white', 'pipe', 'towel', 'doorway', 'lid', 'kettle', 'wooden', 'cutting', 'island', 'shelf', 'picture', 'rug', 'rack', 'plate', 'shadow', 'leg', 'large', 'clock', 'vent', 'old', 'paper', 'switch', 'room'] 2022-03-16 19:17:47,420.420 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'black', 'white', 'top', 'door', 'light', 'floor', 'table', 'wall', 'window', 'kitchen', 'handle', 'cabinet', 'ceiling', 'pan', 'beam', 'sink', 'cord', 'drawer', 'outlet', 'stove', 'knob', 'oven'] 2022-03-16 19:20:10,928.928 2829:trainer.py:487 do_train_dict(): eta: 16:38:05 iter: 31000 speed: 291.3 images/sec total_norm: 139.5078 (142.2444) loss: 150.1836 (150.2809) masked_loss: 1.6066 (1.6128) tag_loss: 148.4965 (148.6682) time: 1.4323 (1.7580) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.7524) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:20:11,289.289 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 19:20:11,289.289 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.26344299316406 2022-03-16 19:20:11,289.289 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
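`Tag Precision.` (its value lands at the start of the next log line, around 70.6 here and drifting upward by hundredths per report) reads like the percentage of generated tags that appear in the ground-truth tag set, presumably accumulated as a running average given how smoothly it moves. A hedged sketch of the per-sample quantity:

```python
def tag_precision(predicted_tags, gt_tags):
    """Precision of a generated tag list against ground-truth tags:
    |predicted ∩ GT| / |predicted|, in percent. Whether the logged value
    is per-batch or a running average is not visible in the log."""
    predicted, gt = set(predicted_tags), set(gt_tags)
    if not predicted:
        return 0.0
    return 100.0 * len(predicted & gt) / len(predicted)

pred = ['table', 'wall', 'shirt', 'bottle', 'plate']
gt = ['table', 'plate', 'fork', 'bottle']
print(tag_precision(pred, gt))  # 60.0
```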
= 70.60965793769076 2022-03-16 19:20:27,251.251 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01990804262459278 2022-03-16 19:20:27,252.252 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:20:27,252.252 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'banana', 'was', 'peeled', '[MASK]', 'had', 'a', '[MASK]', 'taken', 'out', 'of', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:20:27,268.268 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'hand', 'thumb', 'person', 'peel', 'finger', 'background', 'nail', 'man', '[UNK]', 'shirt', 'stem', 'yellow', 'sleeve', 'wall', 'palm', 'ripe', 'window', 'flower', 'picture', 'ring', 'peeled', 'arm', 'bunch', 'logo', 'wrist', 'reflection', 'woman', 'face', 'handle', 'large', 'end', 'letter', 'orange', 'writing', 'half', 'paper', 'white', 'jacket', 'wire', 'object', 'spot', 'bananas', 'line', 'hole', 'close', 'top', 'hair', 'jean', 'label'] 2022-03-16 19:20:43,226.226 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'end', 'person', 'finger', 'bite', 'thumb', 'stem', 'blanket', 'peel', 'banana'] 2022-03-16 19:23:07,019.019 2829:trainer.py:487 do_train_dict(): eta: 16:35:24 iter: 31100 speed: 290.8 images/sec total_norm: 140.4396 (141.3662) loss: 146.1037 (146.7042) masked_loss: 1.5495 (1.5990) tag_loss: 144.8077 (145.1052) time: 1.4335 (1.7609) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.7556) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:23:07,380.380 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 19:23:07,380.380 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.91114807128906 2022-03-16 19:23:07,380.380 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
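`Tag mAP` sits near 0.02 throughout, which is the order of magnitude mean average precision would take over a tag vocabulary of a few thousand entries with only a handful of positives per image. A sketch of sample-wise mAP under that assumption (the pipeline could equally compute AP per class; the log does not say):

```python
import numpy as np

def average_precision(scores, labels):
    """Non-interpolated AP for one sample: rank tags by score and average
    the precision at each rank where a true tag appears."""
    order = np.argsort(-scores)
    labels = labels[order]
    hits = np.cumsum(labels)
    ranks = np.arange(1, len(labels) + 1)
    precisions = hits / ranks
    return float((precisions * labels).sum() / max(labels.sum(), 1))

def tag_map(score_matrix, label_matrix):
    """Mean AP across samples."""
    return float(np.mean([average_precision(s, l)
                          for s, l in zip(score_matrix, label_matrix)]))

scores = np.random.rand(4, 1000)                  # 4 samples, 1000-tag vocab
labels = (np.random.rand(4, 1000) < 0.01).astype(float)
print(tag_map(scores, labels))  # small value, on the order of the ~0.01 positive rate
```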
= 70.61213259819226 2022-03-16 19:23:23,430.430 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019967107102274895 2022-03-16 19:23:23,430.430 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:23:23,431.431 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'pie', 'and', 'a', 'fork', 'rest', 'on', 'a', 'yellow', 'plate', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:23:23,446.446 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'plate', 'table', 'fork', 'handle', '[UNK]', 'piece', 'plastic', 'paper', 'spoon', 'bag', 'food', 'napkin', 'white', 'ice', 'dessert', 'light', 'reflection', 'object', 'knife', 'cream', 'layer', 'hole', 'top', 'bowl', 'shadow', 'cheese', 'line', 'blue', 'spot', 'eaten', 'floor', 'cup', 'close', 'slice', 'chocolate', 'next', 'bread', 'sauce', 'container', 'bottle', 'flower', 'half', 'pie', 'logo', 'hand', 'item', 'cloth', 'box', 'delicious'] 2022-03-16 19:23:39,383.383 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'rest', 'table', 'paper', 'yellow', 'handle', 'plate', 'bottle', 'fork', 'cake', 'pie', 'slice', 'spoon'] 2022-03-16 19:26:03,085.085 2829:trainer.py:487 do_train_dict(): eta: 16:32:42 iter: 31200 speed: 290.8 images/sec total_norm: 140.9551 (143.5243) loss: 146.7103 (146.4920) masked_loss: 1.5391 (1.5932) tag_loss: 144.9235 (144.8988) time: 1.4334 (1.7608) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4283 (1.7556) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:26:03,447.447 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 19:26:03,447.447 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.42505645751953 2022-03-16 19:26:03,447.447 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
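The `Input ids sample` lines show captions wrapped in `[CLS]`...`[SEP]` and padded with `[PAD]` to a fixed length of 70, with some tokens replaced by `[MASK]`. Stray words such as 'valkyrie' in 'bunch of people sits valkyrie a table' are consistent with BERT-style corruption, where a share of the selected positions receives a random vocabulary word instead of `[MASK]`. A sketch; the 15%/80%/10%/10% split is the standard BERT recipe, assumed rather than read from this pipeline's code:

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15, max_len=70):
    """BERT-style input corruption plus fixed-length padding."""
    out = ['[CLS]'] + list(tokens) + ['[SEP]']
    for i in range(1, len(out) - 1):          # never corrupt [CLS]/[SEP]
        if random.random() < mask_prob:
            r = random.random()
            if r < 0.8:
                out[i] = '[MASK]'
            elif r < 0.9:
                out[i] = random.choice(vocab)  # random word, e.g. 'valkyrie'
            # else: keep the original token (still predicted as a target)
    out += ['[PAD]'] * (max_len - len(out))
    return out[:max_len]

vocab = ['valkyrie', 'table', 'dog', 'sky']   # toy vocabulary
print(mask_tokens('a bunch of people sits at a table'.split(), vocab))
```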
= 70.62576716852645 2022-03-16 19:26:19,448.448 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0199777539819479 2022-03-16 19:26:19,449.449 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:26:19,449.449 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'standing', 'with', 'a', '[MASK]', 'phone', 'strapped', 'to', 'his', 'ear', '.', 'hay', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:26:19,464.464 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'wall', 'shirt', 'man', 'lamp', 'door', 'face', 'hair', 'ceiling', 'nose', '[UNK]', 'eye', 'head', 'mouth', 'picture', 'cord', 'wire', 'shade', 'outlet', 'doorway', 'arm', 'ear', 'switch', 'glasses', 'light', 'book', 'frame', 'room', 'sleeve', 'box', 'beard', 'young', 'table', 'logo', 'jean', 'finger', 'cabinet', 'remote', 'phone', 'green', 'game', 'wii', 'tag', 'chin', 'floor', 'person', 'can', 'chair', 'couch', 'boy'] 2022-03-16 19:26:35,377.377 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'band', 'book', 'door', 'hair', 'mouth', 'wall', 'arm', 'phone', 'eye', 'cell', 'shirt', 'teeth', 'nose', 'ear', 'chin', 'ceiling', 'switch', 'shade', 'sleeve', 'lamp', 'cord', 'outlet'] 2022-03-16 19:28:59,175.175 2829:trainer.py:487 do_train_dict(): eta: 16:30:01 iter: 31300 speed: 290.8 images/sec total_norm: 141.2020 (143.7188) loss: 145.8860 (147.5371) masked_loss: 1.6635 (1.6370) tag_loss: 144.0593 (145.9001) time: 1.4332 (1.7609) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7557) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:28:59,535.535 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-16 19:28:59,536.536 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.20855712890625 2022-03-16 19:28:59,536.536 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.62044667286479 2022-03-16 19:29:15,639.639 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019985578954219818 2022-03-16 19:29:15,640.640 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:29:15,640.640 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bright', '[MASK]', 'plane', 'contrasts', 'the', 'blue', 'sky', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:29:15,656.656 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'wing', 'tail', 'airplane', 'yellow', 'propeller', 'blue', 'engine', 'plane', '[UNK]', 'small', 'window', 'body', 'wheel', 'aircraft', 'clear', 'air', 'cockpit', 'stripe', 'nose', 'letter', 'white', 'number', 'blade', 'bottom', 'landing', 'large', 'high', 'day', 'gear', 'bright', 'helicopter', 'light', 'front', 'single', 'writing', 'red', 'logo', 'person', 'close', 'end', 'fin', 'side', 'gray', 'black', 'antenna', 'green', 'tree', 'star', 'old'] 2022-03-16 19:29:31,492.492 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['blue', 'wing', 'sky', 'yellow', 'bright', 'landing', 'plane', 'tail', 'gear', 'airplane', 'propeller'] 2022-03-16 19:31:55,377.377 2829:trainer.py:487 do_train_dict(): eta: 16:27:20 iter: 31400 speed: 290.6 images/sec total_norm: 142.8060 (146.0146) loss: 147.9489 (146.8488) masked_loss: 1.5580 (1.5676) tag_loss: 146.0753 (145.2812) time: 1.4327 (1.7620) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7569) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:31:55,738.738 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-16 19:31:55,738.738 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.169677734375 2022-03-16 19:31:55,738.738 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
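Every `Sample Generation` line contains exactly 50 tags, so the decoder is presumably taking the top 50 vocabulary entries by tag score. A sketch; the sigmoid scoring and the `idx_to_tag` mapping are assumptions:

```python
import torch

def generate_tags(tag_logits, idx_to_tag, top_k=50):
    """Decode the tag head's scores into a ranked tag list; top_k=50 is
    assumed from the constant length of the logged generations."""
    scores = tag_logits.sigmoid()             # multi-label tag scores
    top = scores.topk(min(top_k, scores.numel()))
    return [idx_to_tag[i] for i in top.indices.tolist()]

idx_to_tag = {0: 'table', 1: 'wall', 2: 'shirt', 3: 'plate', 4: 'man'}
logits = torch.tensor([2.0, 1.5, 0.3, -0.2, 1.0])
print(generate_tags(logits, idx_to_tag, top_k=3))  # ['table', 'wall', 'man']
```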
= 70.6303590441507 2022-03-16 19:32:11,893.893 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019961439073085785 2022-03-16 19:32:11,893.893 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:32:11,893.893 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'boys', 'lying', 'down', 'with', 'his', 'two', 'stuffed', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:32:11,909.909 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'hair', 'nose', 'ear', 'face', 'mouth', 'bear', 'head', 'teddy', 'floor', 'foot', 'girl', 'child', 'arm', 'carpet', 'sweater', 'stuffed', 'animal', 'hand', 'leg', 'teeth', 'paw', 'ground', 'boy', 'little', 'baby', 'toy', '[UNK]', 'shirt', 'forehead', 'eyebrow', 'blanket', 'young', 'finger', 'towel', 'small', 'wall', 'toe', 'bow', 'rug', 'ball', 'shoe', 'sock', 'tail', 'next', 'kid', 'tie', 'bang', 'muzzle', 'photo'] 2022-03-16 19:32:27,938.938 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'hair', 'mouth', 'floor', 'child', 'arm', 'boy', 'eye', 'foot', 'baby', 'shirt', 'teeth', 'animal', 'nose', 'ear', 'bear', 'eyebrow', 'carpet', 'teddy', 'stuffed', 'sweater', 'paw'] 2022-03-16 19:34:51,242.242 2829:trainer.py:487 do_train_dict(): eta: 16:24:38 iter: 31500 speed: 291.1 images/sec total_norm: 140.5874 (143.2908) loss: 145.6959 (144.5144) masked_loss: 1.5349 (1.5340) tag_loss: 144.1470 (142.9804) time: 1.4319 (1.7587) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.7535) save_time: 8.8421 (19.6009) lr: 0.000053 max mem: 26307 2022-03-16 19:34:51,605.605 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6216216087341309 2022-03-16 19:34:51,605.605 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.2286834716797 2022-03-16 19:34:51,605.605 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.63051314293584 2022-03-16 19:35:07,903.903 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019935280084609985 2022-03-16 19:35:07,903.903 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:35:07,904.904 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'snow', '##board', '##er', 'sits', 'in', 'snow', 'as', 'another', 'charter', 'along', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:35:07,920.920 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'snow', 'jacket', 'man', 'ground', 'head', 'glove', 'boot', 'coat', 'hat', 'tree', 'leg', 'person', 'hand', 'arm', 'board', 'foot', 'tag', 'sky', 'face', 'patch', 'pole', 'hood', 'helmet', 'mountain', 'shoe', 'cap', 'fence', 'mouth', 'nose', 'ski', 'slope', 'woman', 'scarf', 'track', 'strap', 'hair', 'snowy', 'shadow', 'logo', 'cloud', 'hill', 'line', 'eye', 'flag', 'glasses', 'building', 'background', 'child', 'skier'] 2022-03-16 19:35:23,928.928 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'ground', 'person', 'foot', 'tree', 'sky', 'leg', 'snow', 'coat', 'pole', 'jacket', 'boot', 'shoe', 'backpack', 'strap', 'buckle'] 2022-03-16 19:37:47,363.363 2829:trainer.py:487 do_train_dict(): eta: 16:21:56 iter: 31600 speed: 290.7 images/sec total_norm: 139.4256 (140.1671) loss: 146.7512 (147.9309) masked_loss: 1.4832 (1.5031) tag_loss: 145.2636 (146.4278) time: 1.4319 (1.7612) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4268 (1.7561) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:37:47,725.725 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 19:37:47,725.725 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 171.8945770263672 2022-03-16 19:37:47,725.725 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
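The printed `lr` drifts from 0.000054 down to 0.000050 across these ~2,500 iterations, losing about 1e-6 every 600 iterations, which is consistent with a linear decay printed at six decimal places. The sketch below matches every lr value in this excerpt under assumed parameters (base_lr 1e-4, max_iter 65075, 1,000 warmup iterations); the real schedule could differ:

```python
def linear_decay_lr(iter_num, base_lr, max_iter, warmup_iters=0):
    """Linear warmup followed by linear decay to zero at max_iter."""
    if iter_num < warmup_iters:
        return base_lr * iter_num / max(warmup_iters, 1)
    progress = (iter_num - warmup_iters) / max(max_iter - warmup_iters, 1)
    return base_lr * (1.0 - progress)

for it in (30400, 31600, 32900):
    print(it, f"{linear_decay_lr(it, 1e-4, 65075, warmup_iters=1000):.6f}")
# 30400 0.000054, 31600 0.000052, 32900 0.000050, matching the log
```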
= 70.62982951504199 2022-03-16 19:38:04,079.079 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019937163218855858 2022-03-16 19:38:04,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:38:04,080.080 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'table', 'with', 'plates', 'and', 'containers', 'of', 'food', '[MASK]', 'electronics', '[MASK]', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:38:04,095.095 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'corn', 'tomato', 'table', 'food', '[UNK]', 'bowl', 'meat', 'glass', 'onion', 'handle', 'phone', 'knife', 'spoon', 'person', 'container', 'vegetable', 'bag', 'wall', 'fork', 'napkin', 'potato', 'pen', 'bottle', 'cup', 'cell', 'box', 'hand', 'tray', 'cheese', 'sausage', 'towel', 'keyboard', 'shirt', 'carrot', 'mushroom', 'pan', 'lid', 'paper', 'chair', 'pepper', 'book', 'cabinet', 'stove', 'bread', 'banana', 'foil', 'butter', 'man', 'label'] 2022-03-16 19:38:20,061.061 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'book', 'cup', 'design', 'person', 'table', 'food', 'phone', 'glass', 'box', 'cell', 'ring', 'finger', 'bag', 'bowl', 'plate', 'knife', 'pen', 'glasses', 'cheese', 'keyboard', 'fork', 'corn', 'cord', 'lid', 'butter', 'laptop', 'tomato'] 2022-03-16 19:40:43,802.802 2829:trainer.py:487 do_train_dict(): eta: 16:19:15 iter: 31700 speed: 290.2 images/sec total_norm: 142.4589 (143.7450) loss: 144.6931 (146.5456) masked_loss: 1.5818 (1.5985) tag_loss: 142.8194 (144.9471) time: 1.4335 (1.7643) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.7591) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:40:44,163.163 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-16 19:40:44,163.163 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.7493438720703 2022-03-16 19:40:44,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.63806295095 2022-03-16 19:41:00,448.448 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019949138164520264 2022-03-16 19:41:00,448.448 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:41:00,449.449 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'kids', '[MASK]', 'to', 'surf', 'in', 'the', 'ocean', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:41:00,464.464 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'water', 'arm', '[UNK]', 'hand', 'head', 'wave', 'shirt', 'logo', 'face', 'girl', 'boy', 'ocean', 'board', 'surfer', 'man', 'child', 'leg', 'suit', 'design', 'foot', 'sleeve', 'top', 'young', 'short', 'ear', 'person', 'wet', 'small', 'watch', 'surf', 'woman', 'star', 'glasses', 'name', 'reflection', 'trunk', 'mouth', 'nose', 'back', 'kid', 'beach', 'strap', 'body', 'large', 'boogie', 'bracelet', 'wrist', 'little', 'big'] 2022-03-16 19:41:16,439.439 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'water', 'board', 'hair', 'girl', 'design', 'arm', 'boy', 'shirt', 'ocean', 'leg', 'wave', 'logo', 'reflection'] 03-16 19:42:54.325 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 19:42:54.325 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 19:42:55.514 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 19:43:40,107.107 2829:trainer.py:487 do_train_dict(): eta: 16:16:33 iter: 31800 speed: 290.4 images/sec total_norm: 139.8025 (143.5358) loss: 142.4393 (143.7542) masked_loss: 1.5756 (1.6109) tag_loss: 140.8163 (142.1433) time: 1.4323 (1.7631) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.7579) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:43:40,468.468 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-16 19:43:40,468.468 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.703125 2022-03-16 19:43:40,468.468 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.65341087269559 2022-03-16 19:43:56,732.732 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.019941166043281555 2022-03-16 19:43:56,733.733 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:43:56,733.733 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##walk', 'signal', 'at', 'an', '[MASK]', 'with', 'a', 'car', 'and', 'a', 'bus', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:43:56,748.748 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'light', 'pole', 'sign', 'traffic', 'street', 'person', '[UNK]', 'man', 'wall', 'letter', 'woman', 'hand', 'shirt', 'door', 'store', 'arrow', 'balcony', 'hair', 'reflection', 'railing', 'signal', 'car', 'blind', 'bag', 'jean', 'tree', 'word', 'logo', 'sky', 'banner', 'front', 'wheel', 'stop', 'jacket', 'sidewalk', 'bike', 'post', 'box', 'coat', 'back', 'camera', 'flag', 'tire', 'shadow', 'bus', 'advertisement', 'head', 'purse'] 2022-03-16 19:44:12,590.590 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'building', 'street', 'light', 'woman', 'car', 'hair', 'person', 'window', 'store', 'sign', 'jean', 'shirt', 'bus', 'traffic', 'bag', 'signal', 'plate', 'wheel', 'coat', 'license', 'pole', 'jacket', 'intersection', 'bike', 'arrow', 'purse', 'reflection', 'bicycle', 'sidewalk', 'tire', 'advertisement', 'windshield'] 2022-03-16 19:46:36,316.316 2829:trainer.py:487 do_train_dict(): eta: 16:13:52 iter: 31900 speed: 290.6 images/sec total_norm: 140.7355 (143.6749) loss: 145.0065 (148.3272) masked_loss: 1.5698 (1.5752) tag_loss: 143.1826 (146.7520) time: 1.4311 (1.7621) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4263 (1.7570) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:46:36,676.676 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 19:46:36,676.676 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.28988647460938 2022-03-16 19:46:36,676.676 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
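`max mem: 26307` stays constant for the entire excerpt, which fits a peak-allocation counter rather than a live reading: once the largest activation footprint has been touched, the number never moves. In PyTorch this is typically `torch.cuda.max_memory_allocated()`, reported here in MB (a sketch; the trainer's exact rounding is not visible):

```python
import torch

def max_mem_mb():
    """Peak bytes ever held by the CUDA caching allocator, in MB."""
    return torch.cuda.max_memory_allocated() // (1024 * 1024)

if torch.cuda.is_available():
    x = torch.empty(1024, 1024, device="cuda")  # allocate ~4 MB
    print(f"max mem: {max_mem_mb()}")
```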
= 70.65457969903946 2022-03-16 19:46:53,204.204 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.01998344622552395 2022-03-16 19:46:53,204.204 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:46:53,205.205 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'on', 'a', 'bike', 'with', '[MASK]', 'small', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:46:53,220.220 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'road', 'grass', 'bush', 'dog', 'head', '[UNK]', 'sidewalk', 'curb', 'leg', 'building', 'trash', 'ear', 'street', 'can', 'pole', 'house', 'roof', 'tire', 'window', 'sky', 'box', 'sign', 'bin', 'table', 'door', 'bench', 'trunk', 'line', 'car', 'wheel', 'light', 'spot', 'ground', 'tail', 'wall', 'flower', 'logo', 'face', 'lid', 'mirror', 'truck', 'shirt', 'man', 'plate', 'leaf', 'windshield', 'cow', 'post', 'person'] 2022-03-16 19:47:09,294.294 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'man', 'house', 'hand', 'face', 'small', 'air', 'building', 'road', 'street', 'short', 'rock', 'foot', 'window', 'step', 'tree', 'letter', 'shirt', 'dog', 'leg', 'ear', 'wheel', 'grass', 'bush', 'hat', 'bike', 'logo', 'trunk', 'fence', 'bicycle', 'shoe', 'cart', 'flip', 'sidewalk', 'tire', 'garbage', 'fender', 'stair', 'lettering', 'flop'] 2022-03-16 19:49:32,579.579 2829:trainer.py:487 do_train_dict(): eta: 16:11:10 iter: 32000 speed: 290.5 images/sec total_norm: 139.9445 (142.2715) loss: 144.7415 (145.7677) masked_loss: 1.4750 (1.5651) tag_loss: 143.1267 (144.2026) time: 1.4316 (1.7627) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4264 (1.7574) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:49:32,939.939 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5945945978164673 2022-03-16 19:49:32,939.939 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.02615356445312 2022-03-16 19:49:32,939.939 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.65701913090881 2022-03-16 19:49:49,318.318 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02000613324344158 2022-03-16 19:49:49,319.319 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:49:49,319.319 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'rides', '[MASK]', 'bicycle', ',', 'while', 'a', 'woman', 'holding', 'a', 'blend', '##er', 'on', '[MASK]', 'table', 'gasps', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:49:49,334.334 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bike', 'shirt', 'bicycle', 'man', '[UNK]', 'woman', 'tire', 'sign', 'hat', 'ground', 'sidewalk', 'wheel', 'cap', 'table', 'shoe', 'top', 'head', 'person', 'hand', 'bottle', 'jean', 'tank', 'hair', 'building', 'short', 'arm', 'wall', 'paper', 'shadow', 'base', 'cup', 'window', 'sunglasses', 'seat', 'bag', 'leg', 'handle', 'tattoo', 'dress', 'skirt', 'jug', 'pedal', 'belt', 'pitcher', 'helmet', 'glasses', 'watch', 'liquid', 'pot', 'container'] 2022-03-16 19:50:05,241.241 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'top', 'woman', 'ground', 'person', 'table', 'wall', 'food', 'arm', 'window', 'sign', 'jean', 'shirt', 'dress', 'bag', 'tank', 'shadow', 'wheel', 'belt', 'hat', 'cap', 'liquid', 'bike', 'logo', 'bicycle', 'shoe', 'tattoo', 'sidewalk', 'tire', 'pedal', 'stripe', 'gasps'] 2022-03-16 19:52:28,911.911 2829:trainer.py:487 do_train_dict(): eta: 16:08:28 iter: 32100 speed: 290.4 images/sec total_norm: 139.4007 (143.9840) loss: 145.0429 (147.1746) masked_loss: 1.5359 (1.5990) tag_loss: 143.7283 (145.5756) time: 1.4315 (1.7633) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4263 (1.7577) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:52:29,273.273 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.38235294818878174 2022-03-16 19:52:29,274.274 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.93539428710938 2022-03-16 19:52:29,274.274 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
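The per-iteration `time` is broken into `data` (waiting on the loader), `to_device` (host-to-device copies), and `time_gpu` (compute); in this log the loader wait is ~0.0001 s, i.e. prefetching keeps all eight GPUs fed. A sketch of one way to instrument a step like that (field names mirror the log; the actual hooks in trainer.py are not shown):

```python
import time
import torch

def timed_step(loader_iter, model, device):
    """Sketch of the log's timing split: 'data' is loader wait,
    'to_device' the H2D copy, 'time_gpu' the synchronized compute."""
    t0 = time.time()
    batch = next(loader_iter)                 # 'data'
    t1 = time.time()
    batch = {k: v.to(device, non_blocking=True) for k, v in batch.items()}
    t2 = time.time()
    loss = model(batch)                       # forward/backward would go here
    if device.type == "cuda":
        torch.cuda.synchronize()              # else GPU work measures as ~0
    t3 = time.time()
    return loss, {"data": t1 - t0, "to_device": t2 - t1,
                  "time_gpu": t3 - t2, "time": t3 - t0}
```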
= 70.66980986269364 2022-03-16 19:52:45,860.860 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02003314532339573 2022-03-16 19:52:45,860.860 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:52:45,861.861 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'family', 'standing', 'around', '[MASK]', '[MASK]', 'go', 'down', 'the', 'ski', 'slope', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:52:45,876.876 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'snow', '[UNK]', 'helmet', 'ski', 'tree', 'jacket', 'ground', 'sky', 'glove', 'man', 'mountain', 'skier', 'tent', 'boot', 'group', 'pole', 'head', 'shirt', 'coat', 'hand', 'logo', 'background', 'hat', 'letter', 'arm', 'sign', 'strap', 'backpack', 'child', 'woman', 'kid', 'slope', 'flag', 'tag', 'cloud', 'canopy', 'number', 'girl', 'banner', 'hill', 'suit', 'hood', 'boy', 'vest', 'leg', 'stripe', 'top', 'track', 'sunglasses'] 2022-03-16 19:53:01,807.807 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'ground', 'person', 'child', 'boy', 'mountain', 'tree', 'letter', 'sky', 'background', 'snow', 'kid', 'coat', 'net', 'hat', 'pole', 'jacket', 'ski', 'boot', 'slope', 'helmet', 'shoe', 'glove', 'skier', 'sock'] 2022-03-16 19:55:25,356.356 2829:trainer.py:487 do_train_dict(): eta: 16:05:46 iter: 32200 speed: 290.2 images/sec total_norm: 143.2684 (145.3279) loss: 143.8137 (144.5256) masked_loss: 1.5552 (1.5678) tag_loss: 142.0320 (142.9578) time: 1.4312 (1.7644) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4260 (1.7592) save_time: 8.8421 (19.6009) lr: 0.000052 max mem: 26307 2022-03-16 19:55:25,719.719 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-16 19:55:25,719.719 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.05340576171875 2022-03-16 19:55:25,719.719 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.67699672120274 2022-03-16 19:55:42,189.189 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020033935084939003 2022-03-16 19:55:42,189.189 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:55:42,190.190 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'running', '[MASK]', 'sand', 'with', 'a', '[MASK]', '##is', '##bee', 'in', 'its', 'mouth', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:55:42,205.205 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'sand', 'shadow', 'nose', 'water', '[UNK]', 'eye', 'beach', 'leg', 'head', 'ear', 'tail', 'arrow', 'mouth', 'sky', 'hair', 'paw', 'ocean', 'face', 'blue', 'footprint', 'foot', 'brown', 'design', 'small', 'cloud', 'white', 'ring', 'next', 'star', 'ground', 'top', 'wave', 'sandy', 'leaf', 'writing', 'black', 'arm', 'man', 'rock', 'picture', 'mountain', 'close', 'large', 'disc', 'cute', 'couple', 'short', 'hand', 'green'] 2022-03-16 19:55:58,122.122 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'water', 'mouth', 'eye', 'beach', 'sky', 'dog', 'leg', 'nose', 'ear', 'shadow', 'sand', 'arrow', 'footprint'] 2022-03-16 19:58:21,748.748 2829:trainer.py:487 do_train_dict(): eta: 16:03:04 iter: 32300 speed: 290.3 images/sec total_norm: 141.1412 (144.2714) loss: 146.8610 (146.6246) masked_loss: 1.4923 (1.5456) tag_loss: 145.3705 (145.0791) time: 1.4325 (1.7639) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7588) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 19:58:22,110.110 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 19:58:22,110.110 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.6722412109375 2022-03-16 19:58:22,110.110 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.6906299473327 2022-03-16 19:58:38,797.797 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020054377615451813 2022-03-16 19:58:38,798.798 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 19:58:38,798.798 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'green', 'train', 'traveling', 'down', 'rail', 'road', 'tracks', '[MASK]', 'to', 'a', 'forest', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 19:58:38,814.814 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'track', 'train', 'window', 'grass', 'pole', 'road', 'cloud', 'front', 'wire', 'fence', 'door', 'ground', 'car', 'path', 'building', 'light', '[UNK]', 'gravel', 'green', 'sign', 'power', 'person', 'roof', 'logo', 'passenger', 'wheel', 'line', 'windshield', 'railroad', 'post', 'stripe', 'tower', 'telephone', 'bush', 'next', 'yellow', 'bumper', 'house', 'engine', 'long', 'flower', 'blue', 'top', 'trunk', 'wall', 'street', 'platform', 'hill'] 2022-03-16 19:58:54,764.764 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'building', 'top', 'road', 'front', 'light', 'ground', 'track', 'green', 'forest', 'window', 'train', 'tree', 'sky', 'path', 'rail', 'grass', 'bush', 'cloud', 'pole', 'flower'] 2022-03-16 20:01:18,269.269 2829:trainer.py:487 do_train_dict(): eta: 16:00:22 iter: 32400 speed: 290.1 images/sec total_norm: 142.1349 (145.6806) loss: 145.1860 (146.0428) masked_loss: 1.5131 (1.5705) tag_loss: 143.5649 (144.4722) time: 1.4324 (1.7652) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.7601) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:01:18,629.629 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 20:01:18,629.629 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.820068359375 2022-03-16 20:01:18,629.629 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.69842272244966 2022-03-16 20:01:35,402.402 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020048072561621666 2022-03-16 20:01:35,403.403 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:01:35,403.403 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'kite', '##boarding', 'over', 'the', 'ocean', '[MASK]', 'to', 'shore', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:01:35,419.419 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'cloud', 'building', '[UNK]', 'wave', 'person', 'tree', 'man', 'hill', 'shore', 'house', 'rock', 'beach', 'head', 'arm', 'mountain', 'short', 'hair', 'ocean', 'shirt', 'leg', 'sand', 'board', 'kite', 'boat', 'background', 'suit', 'boy', 'dog', 'hand', 'body', 'woman', 'hat', 'foot', 'top', 'jacket', 'bird', 'island', 'umbrella', 'city', 'tower', 'grass', 'wall', 'surfer', 'tail', 'bush', 'rope', 'shoe', 'ear'] 2022-03-16 20:01:51,299.299 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'house', 'next', 'water', 'building', 'person', 'tree', 'sky', 'ocean', 'wave', 'shore', 'cloud', 'reflection', 'kite'] 2022-03-16 20:04:14,808.808 2829:trainer.py:487 do_train_dict(): eta: 15:57:40 iter: 32500 speed: 290.0 images/sec total_norm: 140.3250 (145.0723) loss: 146.5620 (146.6340) masked_loss: 1.5365 (1.5268) tag_loss: 144.7990 (145.1073) time: 1.4317 (1.7654) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.7602) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:04:15,169.169 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 20:04:15,169.169 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.58038330078125 2022-03-16 20:04:15,169.169 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70738759655163 2022-03-16 20:04:31,791.791 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02004699409008026 2022-03-16 20:04:31,792.792 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:04:31,792.792 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bearded', ',', '[MASK]', 'man', 'wearing', 'a', 'neck', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:04:31,807.807 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'nose', 'ear', 'beard', 'hair', 'wall', 'man', 'face', 'hand', 'tie', 'mouth', 'neck', '[UNK]', 'shirt', 'collar', 'finger', 'arm', 'lip', 'mustache', 'shadow', 'button', 'logo', 'facial', 'knot', 'camera', 'chin', 'wrist', 'eyebrow', 'head', 'shoulder', 'ring', 'tattoo', 'stripe', 'sleeve', 'vest', 'strap', 'forehead', 'bow', 'design', 'dress', 'close', 'front', 'bearded', 'young', 'red', 'white', 'pocket', 'black', 'suit', 'thumb'] 2022-03-16 20:04:47,722.722 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'face', 'hair', 'mouth', 'star', 'wall', 'arm', 'eye', 'neck', 'ring', 'finger', 'nose', 'ear', 'lip', 'tie', 'naked', 'flower', 'beard', 'knot'] 2022-03-16 20:07:11,589.589 2829:trainer.py:487 do_train_dict(): eta: 15:54:59 iter: 32600 speed: 289.6 images/sec total_norm: 145.5555 (148.4005) loss: 147.1146 (147.5797) masked_loss: 1.5570 (1.5840) tag_loss: 145.3728 (145.9957) time: 1.4322 (1.7678) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.7626) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:07:11,950.950 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 20:07:11,950.950 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.18550109863281 2022-03-16 20:07:11,950.950 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.71826982643991 2022-03-16 20:07:28,596.596 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020040763542056084 2022-03-16 20:07:28,597.597 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:07:28,597.597 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'cat', 'making', 'an', 'angry', 'face', 'while', '[MASK]', 'on', 'the', '[MASK]', 'floor', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:07:28,612.612 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'head', 'eye', 'wall', 'nose', 'tile', '[UNK]', 'face', 'black', 'paw', 'floor', 'leg', 'tail', 'rug', 'bathroom', 'carpet', 'mat', 'sink', 'foot', 'bag', 'door', 'towel', 'handle', 'lid', 'white', 'top', 'ground', 'table', 'animal', 'cloth', 'bed', 'cord', 'body', 'tub', 'tag', 'mouth', 'paper', 'bottle', 'box', 'next', 'knob', 'collar', 'container', 'pillow', 'toy', 'bowl', 'curtain', 'shelf', 'book'] 2022-03-16 20:07:44,576.576 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'black', 'ground', 'floor', 'wall', 'eye', 'shirt', 'leg', 'clothes', 'nose', 'ear', 'angry', 'cat', 'bathroom', 'tail', 'clothing', 'towel', 'tile', 'rug', 'paw'] 2022-03-16 20:10:08,098.098 2829:trainer.py:487 do_train_dict(): eta: 15:52:16 iter: 32700 speed: 290.1 images/sec total_norm: 143.1995 (146.2758) loss: 144.7127 (146.0520) masked_loss: 1.5765 (1.6242) tag_loss: 142.6012 (144.4278) time: 1.4312 (1.7651) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4259 (1.7599) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:10:08,459.459 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 20:10:08,459.459 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.48599243164062 2022-03-16 20:10:08,459.459 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.71352985428601 2022-03-16 20:10:25,435.435 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02006969042122364 2022-03-16 20:10:25,435.435 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:10:25,435.435 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'round', '[MASK]', 'disc', 'lying', 'unidentified', 'rippled', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:10:25,451.451 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', '[UNK]', 'reflection', 'tail', 'head', 'light', 'ripple', 'object', 'wing', 'leg', 'bird', 'leaf', 'river', 'neck', 'feather', 'beak', 'shadow', 'body', 'red', 'foot', 'duck', 'ground', 'ball', 'black', 'branch', 'wave', 'back', 'plant', 'pole', 'grass', 'small', 'top', 'hand', 'face', 'handle', 'base', 'stripe', 'arm', 'post', 'white', 'dog', 'dock', 'rope', 'boat', 'yellow', 'line', 'paw', 'lake', 'wet', 'next'] 2022-03-16 20:10:41,367.367 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'red', 'round', 'disc', 'rippled'] 03-16 20:12:55.613 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 20:12:55.613 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 20:12:56.953 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 20:13:05,050.050 2829:trainer.py:487 do_train_dict(): eta: 15:49:35 iter: 32800 speed: 289.3 images/sec total_norm: 142.1701 (144.3814) loss: 145.3614 (147.2254) masked_loss: 1.4899 (1.5801) tag_loss: 143.8517 (145.6453) time: 1.4326 (1.7695) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.7643) save_time: 8.8421 (19.6009) lr: 0.000051 max mem: 26307 2022-03-16 20:13:05,411.411 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-16 20:13:05,411.411 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.00006103515625 2022-03-16 20:13:05,411.411 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70257175294824 2022-03-16 20:13:22,266.266 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02009853534400463 2022-03-16 20:13:22,266.266 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:13:22,267.267 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'sitting', '[MASK]', '[MASK]', 'park', 'bench', 'next', 'to', 'a', 'reflecting', 'pool', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:13:22,282.282 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'bench', 'flower', 'woman', 'hair', 'shoe', 'purse', 'tree', 'leg', 'bag', 'glasses', 'person', 'short', '[UNK]', 'water', 'bottle', 'bush', 'hand', 'skirt', 'lady', 'grass', 'head', 'plant', 'man', 'park', 'sidewalk', 'ground', 'fern', 'step', 'platform', 'front', 'couple', 'watch', 'arm', 'seat', 'dress', 'blouse', 'girl', 'sunglasses', 'wooden', 'slab', 'next', 'trunk', 'face', 'garden', 'top', 'floor', 'can', 'leaf', 'phone'] 2022-03-16 20:13:38,339.339 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'park', 'woman', 'short', 'hair', 'lady', 'plant', 'tree', 'shirt', 'leg', 'bag', 'pool', 'grass', 'bush', 'bottle', 'flower', 'bench', 'glasses', 'purse', 'skirt', 'shoe'] 2022-03-16 20:16:01,951.951 2829:trainer.py:487 do_train_dict(): eta: 15:46:53 iter: 32900 speed: 289.4 images/sec total_norm: 141.7698 (143.5504) loss: 141.2737 (143.8509) masked_loss: 1.5296 (1.5837) tag_loss: 139.8023 (142.2672) time: 1.4319 (1.7691) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.7639) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:16:02,312.312 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 20:16:02,313.313 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.08148193359375 2022-03-16 20:16:02,313.313 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70526480241256 2022-03-16 20:16:19,071.071 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020088929682970047 2022-03-16 20:16:19,071.071 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:16:19,072.072 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'are', 'sitting', '[MASK]', 'top', 'of', 'orange', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:16:19,087.087 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'fruit', 'spot', 'stem', 'orange', 'line', 'table', 'end', 'ripe', '[UNK]', 'top', 'bowl', 'leaf', 'apple', 'dot', 'bananas', 'background', 'bunch', 'plate', 'close', 'wall', 'skin', 'bottom', 'hole', 'citrus', 'face', 'light', 'peel', 'basket', 'other', 'peeled', 'next', 'design', 'piece', 'green', 'shadow', 'pile', 'writing', 'lemon', 'large', 'yellow', 'picture', 'reflection', 'nose', 'rim', 'eye', 'object', 'full', 'blue', 'different'] 2022-03-16 20:16:34,987.987 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'end', 'line', 'top', 'table', 'writing', 'spot', 'orange', 'hole', 'fruit', 'stem', 'bunch', 'dot', 'banana'] 2022-03-16 20:18:58,862.862 2829:trainer.py:487 do_train_dict(): eta: 15:44:11 iter: 33000 speed: 289.4 images/sec total_norm: 141.5494 (144.7813) loss: 144.2122 (144.1205) masked_loss: 1.5725 (1.5776) tag_loss: 142.5147 (142.5429) time: 1.4320 (1.7691) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4268 (1.7638) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:18:59,223.223 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 20:18:59,223.223 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.92584228515625 2022-03-16 20:18:59,223.223 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.69780951324186 2022-03-16 20:19:16,162.162 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020115673542022705 2022-03-16 20:19:16,162.162 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:19:16,162.162 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'dog', 'laying', 'on', 'a', 'bed', 'next', 'to', 'a', 'cat', 'cellar', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:19:16,177.177 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'head', 'leg', 'nose', 'eye', 'dog', 'floor', 'blanket', 'tail', 'table', 'book', 'tag', 'face', 'cat', 'collar', 'towel', 'bed', 'paw', 'cord', 'wall', 'mouth', '[UNK]', 'shelf', 'box', 'pillow', 'basket', 'stripe', 'suitcase', 'chair', 'top', 'wire', 'couch', 'carpet', 'furniture', 'cabinet', 'magazine', 'outlet', 'dvd', 'door', 'next', 'hair', 'paper', 'large', 'brown', 'bag', 'rug', 'black', 'white', 'bowl', 'small'] 2022-03-16 20:19:32,106.106 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'large', 'book', 'floor', 'bed', 'table', 'wall', 'magazine', 'eye', 'box', 'dog', 'leg', 'nose', 'ear', 'cat', 'tail', 'tag', 'leaf', 'blanket', 'curtain', 'cord', 'outlet'] 2022-03-16 20:21:55,771.771 2829:trainer.py:487 do_train_dict(): eta: 15:41:29 iter: 33100 speed: 289.4 images/sec total_norm: 141.7772 (145.0206) loss: 144.6963 (144.3815) masked_loss: 1.5534 (1.5701) tag_loss: 143.4065 (142.8114) time: 1.4317 (1.7691) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.7639) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:21:56,131.131 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-16 20:21:56,131.131 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.4749298095703 2022-03-16 20:21:56,132.132 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.68700901858777 2022-03-16 20:22:13,001.001 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020109426230192184 2022-03-16 20:22:13,001.001 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:22:13,002.002 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'riding', '[MASK]', '##s', '[MASK]', 'a', 'snowy', 'slope', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:22:13,017.017 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glove', 'snow', '[UNK]', 'pole', 'ski', 'ground', 'person', 'jacket', 'skier', 'helmet', 'head', 'hand', 'boot', 'man', 'shirt', 'face', 'slope', 'hill', 'shadow', 'mountain', 'sky', 'cloud', 'hat', 'arm', 'boy', 'tree', 'woman', 'coat', 'snowy', 'foot', 'scarf', 'poles', 'fence', 'leg', 'writing', 'track', 'field', 'hair', 'downhill', 'letter', 'vest', 'sign', 'flag', 'shoe', 'logo', 'line', 'hood', 'skiing', 'trail', 'rock'] 2022-03-16 20:22:29,011.011 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'ground', 'hair', 'person', 'structure', 'sign', 'sky', 'shirt', 'snow', 'cloud', 'pole', 'jacket', 'ski', 'fence', 'boot', 'slope', 'helmet', 'glove', 'snowy', 'skier'] 2022-03-16 20:24:52,826.826 2829:trainer.py:487 do_train_dict(): eta: 15:38:46 iter: 33200 speed: 289.2 images/sec total_norm: 143.1122 (145.4719) loss: 144.1298 (145.8069) masked_loss: 1.5737 (1.6303) tag_loss: 142.2640 (144.1767) time: 1.4316 (1.7706) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.7651) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:24:53,188.188 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 20:24:53,188.188 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 110.79651641845703 2022-03-16 20:24:53,189.189 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70362829947257 2022-03-16 20:25:10,169.169 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020150184631347656 2022-03-16 20:25:10,169.169 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:25:10,169.169 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'putting', 'a', '[MASK]', 'cup', 'in', 'a', 'microwave', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:25:10,185.185 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'hand', 'nose', 'eye', 'wall', 'shirt', 'head', 'cup', 'face', 'ear', 'man', '[UNK]', 'strawberry', 'ceiling', 'person', 'sweater', 'cord', 'microwave', 'finger', 'door', 'light', 'handle', 'mug', 'window', 'design', 'clock', 'thumb', 'shelf', 'bowl', 'kitchen', 'mustache', 'heart', 'tile', 'cabinet', 'flower', 'picture', 'glass', 'plate', 'leaf', 'food', 'woman', 'pot', 'outlet', 'front', 'oven', 'container', 'refrigerator', 'beard', 'knob', 'eyebrow'] 2022-03-16 20:25:26,134.134 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'door', 'light', 'cup', 'heart', 'hair', 'design', 'person', 'wall', 'eye', 'window', 'shirt', 'coffee', 'finger', 'nose', 'ear', 'cabinet', 'ceiling', 'thumb', 'reflection', 'sweater', 'fixture', 'strawberry', 'microwave', 'mustache'] 2022-03-16 20:27:49,550.550 2829:trainer.py:487 do_train_dict(): eta: 15:36:04 iter: 33300 speed: 289.7 images/sec total_norm: 140.5572 (143.3378) loss: 144.4606 (145.6001) masked_loss: 1.6290 (1.6212) tag_loss: 143.2790 (143.9788) time: 1.4318 (1.7672) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4266 (1.7620) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:27:49,911.911 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 20:27:49,911.911 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.1900634765625 2022-03-16 20:27:49,911.911 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.70051733462397 2022-03-16 20:28:07,065.065 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02015579864382744 2022-03-16 20:28:07,066.066 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:28:07,066.066 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'man', 'standing', 'outside', '[MASK]', 'business', 'using', 'cell', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:28:07,082.082 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'jacket', 'building', 'hand', 'hair', 'door', 'head', 'wall', 'window', 'man', 'bike', 'face', 'letter', 'arm', 'bicycle', '[UNK]', 'coat', 'phone', 'ear', 'store', 'mouth', 'handle', 'glasses', 'bottle', 'graffiti', 'collar', 'cell', 'writing', 'shirt', 'pole', 'glass', 'front', 'sidewalk', 'number', 'next', 'jean', 'stop', 'bag', 'old', 'motorcycle', 'chain', 'basket', 'pipe', 'street', 'camera', 'nose', 'reflection', 'black', 'vent', 'shop'] 2022-03-16 20:28:23,043.043 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'door', 'business', 'hair', 'outside', 'wall', 'arm', 'phone', 'window', 'cell', 'store', 'letter', 'sign', 'ear', 'chain', 'handle', 'wheel', 'pole', 'jacket', 'bike', 'bicycle', 'tire', 'poster', 'graffiti'] 2022-03-16 20:30:46,608.608 2829:trainer.py:487 do_train_dict(): eta: 15:33:22 iter: 33400 speed: 289.2 images/sec total_norm: 141.0561 (145.0242) loss: 144.2563 (145.0876) masked_loss: 1.5312 (1.5587) tag_loss: 142.6852 (143.5288) time: 1.4335 (1.7705) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7653) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:30:46,970.970 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 20:30:46,970.970 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.08206939697266 2022-03-16 20:30:46,970.970 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.71401871638511 2022-03-16 20:31:04,083.083 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02022990956902504 2022-03-16 20:31:04,084.084 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:31:04,084.084 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'plan', 'in', '[MASK]', '[MASK]', 'with', 'cord', 'attached', 'and', 'stairs', 'attached', 'to', 'open', 'door', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:31:04,100.100 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['airplane', 'window', 'wing', 'engine', 'cockpit', 'ground', 'stair', 'sky', 'floor', '[UNK]', 'nose', 'wheel', 'door', 'tail', 'logo', 'building', 'airport', 'staircase', 'front', 'light', 'cone', 'line', 'plane', 'wall', 'windshield', 'person', 'cart', 'jet', 'vehicle', 'ladder', 'platform', 'step', 'tire', 'car', 'large', 'ceiling', 'terminal', 'landing', 'man', 'gear', 'truck', 'letter', 'number', 'shirt', 'walkway', 'stripe', 'tunnel', 'railing', 'ramp', 'shadow'] 2022-03-16 20:31:20,036.036 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'body', 'door', 'front', 'ground', 'person', 'engine', 'airport', 'window', 'wing', 'sky', 'roof', 'nose', 'wheel', 'tail', 'ceiling', 'logo', 'ladder', 'cord', 'airplane', 'cockpit', 'propeller', 'windshield', 'stair'] 2022-03-16 20:33:43,841.841 2829:trainer.py:487 do_train_dict(): eta: 15:30:40 iter: 33500 speed: 288.9 images/sec total_norm: 141.8338 (145.9614) loss: 143.0138 (144.8058) masked_loss: 1.4632 (1.5366) tag_loss: 141.7024 (143.2691) time: 1.4325 (1.7724) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.7671) save_time: 8.8421 (19.6009) lr: 0.000050 max mem: 26307 2022-03-16 20:33:44,205.205 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 20:33:44,205.205 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.57501220703125 2022-03-16 20:33:44,206.206 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7188180628277 2022-03-16 20:34:01,404.404 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020215952768921852 2022-03-16 20:34:01,404.404 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:34:01,404.404 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'vintage', '[MASK]', 'style', 'clock', 'is', 'on', 'an', 'outdoor', 'pot', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:34:01,420.420 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'pole', 'building', 'cloud', 'grass', 'light', 'road', 'street', 'sign', 'bush', 'tree', 'window', 'car', 'truck', 'city', 'tower', 'graffiti', 'ground', '[UNK]', 'line', 'traffic', 'roof', 'bench', 'bridge', 'cloudy', 'logo', 'door', 'fence', 'sand', 'trailer', 'stop', 'windshield', 'telephone', 'person', 'bus', 'post', 'water', 'hill', 'side', 'distance', 'clock', 'beach', 'large', 'empty', 'red', 'sidewalk', 'antenna', 'cross', 'next', 'wall'] 2022-03-16 20:34:17,348.348 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'hand', 'building', 'road', 'street', 'light', 'ground', 'wall', 'bridge', 'window', 'tree', 'tower', 'sky', 'roof', 'snow', 'clock', 'grass', 'bush', 'cloud', 'pole', 'bench', 'outdoor', 'barrel', 'fence', 'pot', 'ladder', 'crane', 'trash', 'graffiti'] 2022-03-16 20:36:40,825.825 2829:trainer.py:487 do_train_dict(): eta: 15:27:57 iter: 33600 speed: 289.3 images/sec total_norm: 141.5611 (145.0671) loss: 141.5509 (143.5653) masked_loss: 1.5555 (1.5376) tag_loss: 139.6334 (142.0277) time: 1.4310 (1.7699) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4258 (1.7647) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:36:41,187.187 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-16 20:36:41,187.187 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.50167846679688 2022-03-16 20:36:41,187.187 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.72325416134798 2022-03-16 20:36:58,328.328 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020254503935575485 2022-03-16 20:36:58,328.328 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:36:58,328.328 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sandwich', 'with', 'chocolate', 'spread', 'arranged', 'on', '[MASK]', 'white', 'plate', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:36:58,344.344 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'table', 'cake', 'sandwich', 'cheese', 'chocolate', 'sauce', 'meat', 'food', 'bread', 'dessert', '[UNK]', 'handle', 'cream', 'ice', 'white', 'napkin', 'layer', 'crust', 'top', 'container', 'bowl', 'fork', 'piece', 'wall', 'design', 'background', 'steak', 'bean', 'butter', 'hand', 'egg', 'close', 'light', 'stripe', 'half', 'spoon', 'label', 'glass', 'bun', 'finger', 'shadow', 'bottle', 'stain', 'pie', 'hole', 'stem', 'object', 'eaten', 'paper'] 2022-03-16 20:37:14,243.243 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['white', 'top', 'table', 'food', 'spread', 'letter', 'label', 'background', 'plate', 'shadow', 'apple', 'bread', 'stem', 'chocolate', 'logo', 'cheese', 'cake', 'sandwich', 'lid', 'sauce', 'banana', 'jar', 'dessert'] 2022-03-16 20:39:38,087.087 2829:trainer.py:487 do_train_dict(): eta: 15:25:15 iter: 33700 speed: 288.8 images/sec total_norm: 140.8120 (144.1257) loss: 143.2858 (145.2971) masked_loss: 1.5024 (1.5549) tag_loss: 142.1136 (143.7421) time: 1.4321 (1.7726) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4266 (1.7673) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:39:38,449.449 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 20:39:38,449.449 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.9434051513672 2022-03-16 20:39:38,450.450 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7336538529255 2022-03-16 20:39:55,734.734 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020248694345355034 2022-03-16 20:39:55,734.734 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:39:55,735.735 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'child', 'is', 'sitting', 'on', 'a', 'toilet', 'with', '[MASK]', '[MASK]', 'device', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:39:55,750.750 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'toilet', 'shirt', 'hand', 'leg', 'boy', 'floor', 'wall', 'sock', 'bowl', 'shoe', 'head', 'child', 'bathroom', 'seat', 'window', '[UNK]', 'phone', 'book', 'arm', 'lid', 'short', 'face', 'tile', 'tank', 'door', 'person', 'young', 'handle', 'elbow', 'girl', 'foot', 'boot', 'reflection', 'ear', 'cell', 'curtain', 'brush', 'nose', 'black', 'remote', 'water', 'paper', 'room', 'picture', 'light', 'pipe', 'ceiling', 'photo', 'ledge'] 2022-03-16 20:40:11,704.704 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'small', 'book', 'hair', 'girl', 'floor', 'child', 'seat', 'boy', 'base', 'window', 'box', 'shirt', 'leg', 'clothes', 'bowl', 'electronic', 'device', 'bathroom', 'toilet', 'tile', 'sock'] 2022-03-16 20:42:35,524.524 2829:trainer.py:487 do_train_dict(): eta: 15:22:33 iter: 33800 speed: 288.6 images/sec total_norm: 140.3011 (144.1053) loss: 145.0025 (144.8953) masked_loss: 1.5766 (1.5926) tag_loss: 143.4026 (143.3026) time: 1.4325 (1.7744) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.7692) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:42:35,886.886 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 20:42:35,886.886 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.20892333984375 2022-03-16 20:42:35,886.886 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74572365684847 2022-03-16 20:42:53,181.181 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02030801586806774 2022-03-16 20:42:53,181.181 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:42:53,182.182 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'vase', 'with', 'several', '[MASK]', 'flowers', 'in', 'it', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:42:53,197.197 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'flower', 'vase', 'finger', 'shirt', 'man', 'ring', 'table', 'person', 'rose', '[UNK]', 'arm', 'napkin', 'leaf', 'wall', 'mouth', 'handle', 'neck', 'phone', 'pitcher', 'cup', 'button', 'container', 'plate', 'paper', 'cell', 'hair', 'cloth', 'base', 'face', 'cake', 'white', 'watch', 'woman', 'elbow', 'candy', 'top', 'jug', 'thumb', 'design', 'glass', 'chair', 'wrist', 'rim', 'remote', 'light', 'ear', 'pink', 'lid', 'stripe'] 03-16 20:42:57.047 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 20:42:57.047 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 20:42:57.730 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}] 2022-03-16 20:43:09,179.179 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'several', 'white', 'person', 'table', 'ring', 'finger', 'handle', 'salt', 'flower', 'stem', 'elbow', 'candy', 'pepper', 'colorful', 'vase', 'jug'] 2022-03-16 20:45:32,641.641 2829:trainer.py:487 do_train_dict(): eta: 15:19:50 iter: 33900 speed: 289.1 images/sec total_norm: 141.5618 (144.2980) loss: 143.2617 (142.4413) masked_loss: 1.5043 (1.5261) tag_loss: 141.7666 (140.9151) time: 1.4326 (1.7712) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7661) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:45:33,002.002 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 20:45:33,002.002 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.77383422851562 2022-03-16 20:45:33,002.002 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7429765028112 2022-03-16 20:45:50,244.244 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02031378448009491 2022-03-16 20:45:50,245.245 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:45:50,245.245 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'lighting', '[MASK]', 'piece', 'if', 'cake', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:45:50,260.260 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'cake', 'plate', 'table', 'finger', 'candle', 'person', 'fork', '[UNK]', 'flame', 'shadow', 'handle', 'knife', 'ring', 'white', 'wall', 'man', 'background', 'lit', 'blade', 'wrist', 'arm', 'food', 'design', 'thumb', 'light', 'shirt', 'napkin', 'photo', 'piece', 'picture', 'glass', 'woman', 'spoon', 'head', 'cloth', 'nail', 'top', 'birthday', 'logo', 'face', 'stem', 'eye', 'sleeve', 'dark', 'small', 'watch', 'front', 'stick', 'couple'] 2022-03-16 20:46:06,104.104 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'person', 'table', 'arm', 'piece', 'finger', 'handle', 'plate', 'shadow', 'knife', 'flame', 'fork', 'cake', 'candle'] 2022-03-16 20:48:30,199.199 2829:trainer.py:487 do_train_dict(): eta: 15:17:08 iter: 34000 speed: 288.4 images/sec total_norm: 139.9951 (143.6482) loss: 145.9922 (145.1682) masked_loss: 1.5660 (1.5908) tag_loss: 144.3510 (143.5774) time: 1.4319 (1.7756) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4267 (1.7703) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:48:30,560.560 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-16 20:48:30,560.560 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.73997497558594 2022-03-16 20:48:30,560.560 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74217898684863 2022-03-16 20:48:48,098.098 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020349852740764618 2022-03-16 20:48:48,099.099 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:48:48,099.099 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'bunk', 'bed', '[MASK]', 'a', 'room', 'with', 'the', 'name', 'palmer', 'on', '##pled', 'wall', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:48:48,115.115 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bed', 'floor', 'window', 'ladder', 'carpet', 'bunk', 'room', 'ceiling', 'outlet', '[UNK]', 'sheet', 'pillow', 'frame', 'toy', 'shelf', 'light', 'blanket', 'bedroom', 'post', 'curtain', 'sign', 'mattress', 'decoration', 'animal', 'door', 'lamp', 'star', 'stripe', 'fan', 'leg', 'drawer', 'handle', 'small', 'switch', 'flower', 'board', 'picture', 'bar', 'rack', 'mat', 'chair', 'blind', 'mirror', 'rug', 'knob', 'head', 'blue', 'rail', 'object'] 2022-03-16 20:49:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'name', 'room', 'black', 'light', 'board', 'floor', 'bed', 'wall', 'stand', 'window', 'ball', 'bedroom', 'fan', 'ceiling', 'shade', 'toy', 'carpet', 'ladder', 'curtain', 'shelf', 'outlet', 'stripe', 'bunk'] 2022-03-16 20:51:27,588.588 2829:trainer.py:487 do_train_dict(): eta: 15:14:25 iter: 34100 speed: 288.6 images/sec total_norm: 141.9835 (142.5334) loss: 143.0666 (144.1223) masked_loss: 1.4782 (1.5240) tag_loss: 141.9223 (142.5983) time: 1.4321 (1.7738) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4269 (1.7686) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:51:27,947.947 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-16 20:51:27,948.948 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.42034912109375 2022-03-16 20:51:27,948.948 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.73802888881393 2022-03-16 20:51:45,256.256 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02031753584742546 2022-03-16 20:51:45,256.256 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:51:45,256.256 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'beautiful', 'woman', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', 'on', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:51:45,272.272 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'hair', '[UNK]', 'tennis', 'shirt', 'belt', 'ball', 'woman', 'ponytail', 'head', 'court', 'arm', 'face', 'leg', 'player', 'nose', 'handle', 'ground', 'mouth', 'eye', 'band', 'string', 'ear', 'net', 'short', 'line', 'wrist', 'waist', 'watch', 'wall', 'logo', 'pony', 'tail', 'fence', 'outfit', 'finger', 'top', 'bracelet', 'buckle', 'stripe', 'uniform', 'person', 'tape', 'female', 'ready', 'sleeve', 'shoe', 'sock', 'game', 'white'] 2022-03-16 20:52:01,107.107 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'band', 'player', 'woman', 'court', 'ground', 'hair', 'mouth', 'arm', 'eye', 'beautiful', 'ball', 'ring', 'shirt', 'finger', 'nose', 'ear', 'pocket', 'handle', 'tennis', 'string', 'belt', 'net', 'waist', 'wrist', 'ponytail'] 2022-03-16 20:54:25,035.035 2829:trainer.py:487 do_train_dict(): eta: 15:11:43 iter: 34200 speed: 288.5 images/sec total_norm: 141.1525 (145.2789) loss: 148.2432 (148.3236) masked_loss: 1.6071 (1.6107) tag_loss: 146.1701 (146.7129) time: 1.4327 (1.7744) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7693) save_time: 8.8421 (19.6009) lr: 0.000049 max mem: 26307 2022-03-16 20:54:25,399.399 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 20:54:25,399.399 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.07925415039062 2022-03-16 20:54:25,400.400 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74150257833497 2022-03-16 20:54:42,913.913 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020320162177085876 2022-03-16 20:54:42,913.913 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:54:42,913.913 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', '[MASK]', 'and', 'canvas', 'chairs', ',', 'one', 'tilted', 'forward', 'with', 'a', 'cat', 'laying', 'under', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:54:42,928.928 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'chair', 'wall', 'ground', 'ear', 'floor', 'head', 'door', 'leg', 'table', 'flower', 'cushion', '[UNK]', 'tail', 'cloth', 'pillow', 'paw', 'kitten', 'blanket', 'leaf', 'nose', 'eye', 'seat', 'shadow', 'window', 'small', 'back', 'top', 'mat', 'next', 'dot', 'building', 'cord', 'curtain', 'orange', 'white', 'circle', 'wooden', 'wheel', 'room', 'line', 'front', 'wood', 'reflection', 'brown', 'sidewalk', 'face', 'patio', 'paper', 'magazine'] 2022-03-16 20:54:58,833.833 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'door', 'ground', 'floor', 'table', 'wall', 'chair', 'wood', 'leg', 'cat', 'net', 'flower', 'canvas'] 2022-03-16 20:57:22,671.671 2829:trainer.py:487 do_train_dict(): eta: 15:09:00 iter: 34300 speed: 288.2 images/sec total_norm: 144.4255 (146.0896) loss: 146.3049 (145.7596) masked_loss: 1.5811 (1.5534) tag_loss: 145.1051 (144.2063) time: 1.4333 (1.7764) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.7713) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 20:57:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-16 20:57:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 172.26119995117188 2022-03-16 20:57:23,032.032 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.73583235851554 2022-03-16 20:57:40,437.437 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020339546725153923 2022-03-16 20:57:40,437.437 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 20:57:40,438.438 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'and', 'several', '[MASK]', 'standing', 'in', 'a', 'pet', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 20:57:40,453.453 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cage', 'shirt', 'man', 'glasses', 'bird', 'hair', 'shelf', '[UNK]', 'bag', 'woman', 'hand', 'container', 'person', 'box', 'crate', 'cart', 'head', 'table', 'arm', 'shoe', 'watch', 'face', 'basket', 'banana', 'bottle', 'building', 'bucket', 'bin', 'case', 'jean', 'door', 'strap', 'tray', 'sign', 'jug', 'store', 'cooler', 'lady', 'cap', 'shop', 'floor', 'rack', 'light', 'food', 'ground', 'jacket', 'hat', 'girl', 'ceiling', 'glove'] 2022-03-16 20:57:56,366.366 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'several', 'woman', 'hair', 'girl', 'person', 'child', 'table', 'lady', 'watch', 'box', 'jean', 'shirt', 'shop', 'bag', 'bird', 'belt', 'glasses', 'cage', 'purse', 'pet', 'boot', 'skirt', 'ladder', 'cart', 'shelf', 'container', 'tray', 'banana', 'scissors', 'crate'] 2022-03-16 21:00:20,309.309 2829:trainer.py:487 do_train_dict(): eta: 15:06:18 iter: 34400 speed: 288.2 images/sec total_norm: 143.8343 (148.4238) loss: 147.6006 (147.1369) masked_loss: 1.4889 (1.5125) tag_loss: 145.9933 (145.6244) time: 1.4318 (1.7764) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.7709) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:00:20,670.670 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-16 21:00:20,671.671 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.43788146972656 2022-03-16 21:00:20,671.671 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.74925847813704 2022-03-16 21:00:38,221.221 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020341960713267326 2022-03-16 21:00:38,221.221 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:00:38,221.221 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'ready', 'to', 'launch', 'a', 'colorful', 'kite', 'on', 'the', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:00:38,236.236 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'shirt', 'hat', 'cloud', 'man', 'kite', 'building', '[UNK]', 'hand', 'head', 'arm', 'stair', 'grass', 'tree', 'railing', 'ground', 'flag', 'sunglasses', 'person', 'fence', 'short', 'boy', 'leg', 'shoe', 'wall', 'roof', 'jean', 'bag', 'bush', 'child', 'window', 'step', 'pole', 'shadow', 'sidewalk', 'bridge', 'cap', 'woman', 'ladder', 'umbrella', 'flower', 'tail', 'hair', 'post', 'sign', 'house', 'truck', 'glasses', 'car', 'park'] 2022-03-16 21:00:54,082.082 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'hand', 'face', 'building', 'short', 'ground', 'post', 'arm', 'hill', 'date', 'ready', 'foot', 'beach', 'sky', 'shirt', 'leg', 'roof', 'grass', 'hat', 'cloud', 'colorful', 'kite'] 2022-03-16 21:03:17,806.806 2829:trainer.py:487 do_train_dict(): eta: 15:03:35 iter: 34500 speed: 288.5 images/sec total_norm: 143.5294 (145.6437) loss: 143.9007 (144.6687) masked_loss: 1.5323 (1.5347) tag_loss: 142.4247 (143.1339) time: 1.4313 (1.7750) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.7698) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:03:18,166.166 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-16 21:03:18,166.166 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.76185607910156 2022-03-16 21:03:18,166.166 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.75202440250816 2022-03-16 21:03:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02033899910748005 2022-03-16 21:03:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:03:35,965.965 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'animal', '[MASK]', 'sitting', 'in', 'between', 'two', 'pillows', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:03:35,980.980 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'pillow', 'wall', '[UNK]', 'head', 'eye', 'bear', 'ear', 'teddy', 'knob', 'arm', 'blanket', 'shadow', 'drawer', 'table', 'post', 'animal', 'window', 'nightstand', 'nose', 'panel', 'top', 'shade', 'frame', 'stuffed', 'leg', 'sheet', 'lamp', 'dresser', 'face', 'bolt', 'light', 'cover', 'wood', 'bedroom', 'board', 'picture', 'wooden', 'design', 'small', 'foot', 'room', 'white', 'paw', 'toy', 'flower', 'laying', 'next', 'chair', 'screw'] 2022-03-16 21:03:51,937.937 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'bed', 'wall', 'arm', 'eye', 'animal', 'ear', 'bear', 'shadow', 'blanket', 'pillow', 'lamp', 'teddy', 'stuffed', 'drawer', 'strap', 'knob'] 2022-03-16 21:06:15,636.636 2829:trainer.py:487 do_train_dict(): eta: 15:00:52 iter: 34600 speed: 287.9 images/sec total_norm: 141.7339 (143.3657) loss: 143.4248 (144.9419) masked_loss: 1.4986 (1.5056) tag_loss: 141.2679 (143.4364) time: 1.4327 (1.7783) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7731) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:06:15,998.998 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 21:06:15,999.999 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.89132690429688 2022-03-16 21:06:15,999.999 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.76205342097653 2022-03-16 21:06:33,543.543 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020345093682408333 2022-03-16 21:06:33,543.543 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:06:33,544.544 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', ',', 'one', 'holding', 'a', 'chicken', 'and', 'one', '[MASK]', 'a', 'don', '##ut', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:06:33,559.559 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'shirt', 'hair', 'face', 'box', 'wall', 'table', 'head', 'eye', 'woman', 'girl', 'nose', '[UNK]', 'window', 'bag', 'mouth', 'couch', 'dress', 'paper', 'glasses', 'chair', 'napkin', 'arm', 'ear', 'child', 'picture', 'man', 'food', 'watch', 'bracelet', 'cup', 'pillow', 'door', 'straw', 'floor', 'finger', 'hat', 'knife', 'plant', 'necklace', 'wrist', 'bow', 'book', 'tissue', 'cabinet', 'bird', 'neck', 'glass', 'lady', 'handle'] 2022-03-16 21:06:49,440.440 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'woman', 'hair', 'mouth', 'table', 'wall', 'arm', 'eye', 'chair', 'paper', 'box', 'shirt', 'dress', 'nose', 'chicken', 'curtain', 'napkin'] 2022-03-16 21:09:13,324.324 2829:trainer.py:487 do_train_dict(): eta: 14:58:09 iter: 34700 speed: 288.1 images/sec total_norm: 143.1438 (145.1093) loss: 143.4171 (145.1008) masked_loss: 1.6046 (1.6009) tag_loss: 141.8913 (143.4999) time: 1.4326 (1.7768) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4274 (1.7716) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:09:13,684.684 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8333333134651184 2022-03-16 21:09:13,684.684 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.01309204101562 2022-03-16 21:09:13,685.685 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.770817603188 2022-03-16 21:09:31,507.507 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020339904353022575 2022-03-16 21:09:31,507.507 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:09:31,507.507 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'in', '[MASK]', 'coats', '[MASK]', 'doing', 'something', 'they', 'pulled', 'up', '[MASK]', 'their', 'laptop', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:09:31,523.523 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['laptop', 'keyboard', 'hand', 'hair', 'table', 'screen', 'woman', 'shirt', 'key', 'computer', 'person', '[UNK]', 'desk', 'wall', 'glasses', 'pole', 'jacket', 'chair', 'tray', 'head', 'sleeve', 'man', 'pen', 'ear', 'face', 'picture', 'handle', 'bottle', 'girl', 'sunglasses', 'container', 'cord', 'mouse', 'bag', 'food', 'cup', 'purse', 'light', 'scissors', 'boy', 'fork', 'box', 'plate', 'ring', 'glass', 'arm', 'logo', 'lamp', 'spoon', 'paper'] 2022-03-16 21:09:47,513.513 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'something', 'woman', 'hair', 'person', 'table', 'wall', 'key', 'computer', 'shirt', 'screen', 'ear', 'desk', 'handle', 'coat', 'pan', 'jacket', 'lab', 'pen', 'glasses', 'logo', 'brush', 'keyboard', 'tray', 'laptop', 'sunglasses'] 2022-03-16 21:12:11,473.473 2829:trainer.py:487 do_train_dict(): eta: 14:55:27 iter: 34800 speed: 287.4 images/sec total_norm: 142.2603 (144.7059) loss: 142.1245 (144.3255) masked_loss: 1.5166 (1.5313) tag_loss: 140.5276 (142.7943) time: 1.4339 (1.7815) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.7763) save_time: 8.8421 (19.6009) lr: 0.000048 max mem: 26307 2022-03-16 21:12:11,833.833 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-16 21:12:11,834.834 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.48550415039062 2022-03-16 21:12:11,834.834 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.76653524797761 2022-03-16 21:12:29,800.800 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020334212109446526 2022-03-16 21:12:29,801.801 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 21:12:29,801.801 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'feelings', 'old', 'computer', '[MASK]', 'has', 'been', 'decorated', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 21:12:29,816.816 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['screen', 'table', 'button', 'wall', 'television', 'key', 'keyboard', 'panel', 'desk', 'reflection', 'light', '[UNK]', 'phone', 'knob', 'remote', 'drawer', 'old', 'wooden', 'cabinet', 'control', 'design', 'door', 'wire', 'speaker', 'top', 'box', 'cord', 'computer', 'next', 'dial', 'logo', 'shadow', 'monitor', 'book', 'small', 'mouse', 'colorful', 'set', 'tray', 'floor', 'room', 'paper', 'number', 'curtain', 'cell', 'handle', 'picture', 'tv', 'close', 'red'] 2022-03-16 21:12:45,731.731 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['old', 'television', 'table', 'wall', 'phone', 'key', 'computer', 'screen', 'desk', 'speaker', 'button', 'keyboard', 'reflection', 'drawer', 'dial', 'knob'] 03-16 21:12:57.831 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 21:12:57.831 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 21:12:59.144 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 21:15:09,378.378 2829:trainer.py:487 do_train_dict(): eta: 14:52:44 iter: 34900 speed: 287.8 images/sec total_norm: 142.3826 (146.3888) loss: 144.5862 (144.5441) masked_loss: 1.4906 (1.5241) tag_loss: 142.8866 (143.0199) time: 1.4335 (1.7791) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4283 (1.7740) save_time: 8.8421 (19.6009) lr: 0.000047 max mem: 26307 2022-03-16 21:15:09,738.738 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-16 21:15:09,739.739 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.50486755371094 2022-03-16 21:15:09,739.739 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.7805048588344
2022-03-16 21:15:27,494.494 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020334094762802124
2022-03-16 21:15:27,494.494 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:15:27,495.495 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'zebra', '[MASK]', 'are', 'standing', '[MASK]', 'to', 'the', 'camera', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:15:27,510.510 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'ear', 'eye', 'mane', 'head', 'nose', 'ground', 'leg', 'stripe', '[UNK]', 'mouth', 'face', 'neck', 'dirt', 'other', 'grass', 'rock', 'foot', 'close', 'next', 'spot', 'hair', 'tree', 'back', 'field', 'chin', 'plant', 'area', 'white', 'fence', 'bush', 'group', 'muzzle', 'body', 'hay', 'background', 'side', 'branch', 'wall', 'shadow', 'leaf', 'herd', 'snout', 'view', 'couple', 'road', 'picture', 'paw', 'camera', 'trunk']
2022-03-16 21:15:43,439.439 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'ground', 'mouth', 'eye', 'neck', 'foot', 'leg', 'nose', 'ear', 'camera', 'stripe', 'mane', 'zebra']
2022-03-16 21:18:07,277.277 2829:trainer.py:487 do_train_dict(): eta: 14:50:01 iter: 35000 speed: 287.8 images/sec total_norm: 143.2993 (145.0793) loss: 140.3727 (141.5709) masked_loss: 1.4522 (1.4828) tag_loss: 138.5324 (140.0880) time: 1.4319 (1.7790) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.7738) save_time: 8.8421 (19.6009) lr: 0.000047 max mem: 26307
2022-03-16 21:18:07,279.279 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0035000.pt
2022-03-16 21:18:16,824.824 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625
2022-03-16 21:18:16,824.824 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.883544921875
2022-03-16 21:18:16,824.824 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.78594291447914
2022-03-16 21:18:34,897.897 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02035629004240036
2022-03-16 21:18:34,897.897 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:18:34,898.898 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'that', 'is', 'on', 'her', '[MASK]', 'phone', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:18:34,913.913 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'tire', 'woman', 'face', 'leg', 'bracelet', 'hair', 'car', 'head', 'ring', 'dress', 'sunglasses', 'phone', 'glasses', 'bench', 'wheel', '[UNK]', 'shadow', 'wall', 'plant', 'finger', 'shoe', 'window', 'cell', 'street', 'nose', 'girl', 'building', 'arm', 'shirt', 'sidewalk', 'short', 'light', 'person', 'mouth', 'brick', 'ground', 'wrist', 'suv', 'foot', 'tree', 'road', 'pot', 'rim', 'bush', 'heel', 'man', 'handle', 'lady', 'weed']
2022-03-16 21:18:50,635.635 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'street', 'woman', 'car', 'ground', 'hair', 'girl', 'person', 'wall', 'phone', 'lady', 'plant', 'window', 'tree', 'cell', 'branch', 'ring', 'block', 'leg', 'dress', 'bag', 'shadow', 'wheel', 'pole', 'flower', 'bench', 'leaf', 'glasses', 'tire', 'sunglasses', 'bracelet']
2022-03-16 21:21:13,592.592 2829:trainer.py:487 do_train_dict(): eta: 14:47:26 iter: 35100 speed: 274.8 images/sec total_norm: 143.6857 (145.3042) loss: 144.8091 (145.2275) masked_loss: 1.4594 (1.5247) tag_loss: 143.5081 (143.7028) time: 1.4334 (1.8632) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7663) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:21:13,953.953 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.40625
2022-03-16 21:21:13,953.953 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.30152893066406
2022-03-16 21:21:13,953.953 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.79033785516566
2022-03-16 21:21:31,900.900 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02036704309284687
2022-03-16 21:21:31,901.901 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:21:31,901.901 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'standing', '[MASK]', 'a', 'room', 'next', 'to', 'lots', 'of', '[MASK]', 'chairs', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:21:31,917.917 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tie', 'curtain', 'shirt', 'man', '[UNK]', 'chair', 'floor', 'carpet', 'light', 'shoe', 'ceiling', 'person', 'hand', 'arm', 'room', 'belt', 'hair', 'table', 'head', 'dress', 'suit', 'wall', 'leg', 'jacket', 'building', 'line', 'red', 'stage', 'sign', 'front', 'shadow', 'bow', 'reflection', 'window', 'column', 'hat', 'jean', 'woman', 'formal', 'white', 'ground', 'screen', 'ball', 'sky', 'pole', 'face', 'black', 'blue', 'large', 'letter']
2022-03-16 21:21:47,824.824 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'room', 'white', 'light', 'person', 'floor', 'arm', 'chair', 'window', 'sign', 'shirt', 'tie', 'waist', 'ceiling', 'jacket', 'carpet', 'shoe', 'curtain']
2022-03-16 21:24:11,613.613 2829:trainer.py:487 do_train_dict(): eta: 14:44:43 iter: 35200 speed: 287.6 images/sec total_norm: 141.3476 (142.8795) loss: 140.0146 (139.5841) masked_loss: 1.4799 (1.5152) tag_loss: 138.8960 (138.0689) time: 1.4325 (1.7803) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4275 (1.7752) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:24:11,976.976 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4864864945411682
2022-03-16 21:24:11,976.976 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.02099609375
2022-03-16 21:24:11,976.976 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81322726717076
2022-03-16 21:24:30,056.056 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020360779017210007
2022-03-16 21:24:30,056.056 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:24:30,057.057 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'view', '[MASK]', 'a', 'living', 'room', '[MASK]', 'couch', '##es', 'and', 'chairs', '[MASK]', 'on', 'a', 'carpet', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:24:30,072.072 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'room', 'chair', 'wall', 'ceiling', 'picture', 'light', 'window', 'lamp', 'table', 'floor', 'plant', 'shade', 'couch', 'carpet', 'television', 'living', 'blanket', 'pillow', 'screen', 'sofa', 'pot', '[UNK]', 'flower', 'vase', 'leg', 'arm', 'painting', 'armchair', 'monitor', 'cushion', 'area', 'stand', 'large', 'shelf', 'outlet', 'top', 'computer', 'door', 'blade', 'fan', 'poster', 'ottoman', 'glass', 'laptop', 'switch', 'paper', 'book', 'furniture', 'end']
2022-03-16 21:24:46,024.024 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'light', 'living', 'television', 'floor', 'table', 'wall', 'view', 'chair', 'plant', 'foot', 'window', 'picture', 'screen', 'bird', 'ceiling', 'couch', 'monitor', 'shade', 'pot', 'pillow', 'carpet', 'lamp', 'sofa', 'curtain']
2022-03-16 21:27:09,703.703 2829:trainer.py:487 do_train_dict(): eta: 14:42:00 iter: 35300 speed: 287.5 images/sec total_norm: 142.3346 (145.9948) loss: 142.2887 (144.3121) masked_loss: 1.4734 (1.4894) tag_loss: 141.1812 (142.8227) time: 1.4339 (1.7809) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4289 (1.7756) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:27:10,066.066 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-16 21:27:10,067.067 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.63671875
2022-03-16 21:27:10,067.067 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81355590604792
2022-03-16 21:27:28,259.259 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02044505812227726
2022-03-16 21:27:28,259.259 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:27:28,260.260 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'four', 'fruits', 'put', 'inside', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:27:28,275.275 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fruit', 'carrot', 'leaf', 'orange', 'table', '[UNK]', 'stem', 'top', 'plant', 'food', 'flower', 'ground', 'shadow', 'onion', 'vegetable', 'apple', 'hole', 'bowl', 'banana', 'spot', 'reflection', 'object', 'bunch', 'bottom', 'mushroom', 'inside', 'rim', 'tomato', 'bag', 'surface', 'group', 'light', 'wood', 'berry', 'background', 'red', 'next', 'piece', 'scissors', 'close', 'writing', 'pot', 'nut', 'end', 'other', 'ball', 'handle', 'cup', 'branch', 'black']
2022-03-16 21:27:44,242.242 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['orange', 'fruit', 'flower', 'leaf', 'reflection', 'onion']
2022-03-16 21:30:07,841.841 2829:trainer.py:487 do_train_dict(): eta: 14:39:17 iter: 35400 speed: 287.4 images/sec total_norm: 143.0172 (145.7120) loss: 145.4353 (146.1105) masked_loss: 1.5128 (1.5328) tag_loss: 143.7887 (144.5778) time: 1.4334 (1.7814) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4283 (1.7763) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:30:08,203.203 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-16 21:30:08,203.203 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.83151245117188
2022-03-16 21:30:08,203.203 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.80979587393747
2022-03-16 21:30:26,116.116 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020453158766031265
2022-03-16 21:30:26,117.117 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:30:26,117.117 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'kite', '##s', 'are', 'flying', 'across', 'the', '[MASK]', '[MASK]', 'winds', '##ur', '##fers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:30:26,132.132 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'person', '[UNK]', 'string', 'tail', 'grass', 'air', 'man', 'ground', 'cloud', 'beach', 'horizon', 'water', 'tree', 'shirt', 'hill', 'field', 'sand', 'ocean', 'group', 'flag', 'leg', 'pole', 'building', 'mountain', 'hair', 'background', 'arm', 'line', 'head', 'roof', 'jacket', 'wave', 'hand', 'parachute', 'couple', 'shadow', 'shore', 'fence', 'woman', 'leaf', 'short', 'light', 'shoe', 'house', 'car', 'sun', 'top', 'boat']
2022-03-16 21:30:42,015.015 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'air', 'water', 'building', 'person', 'distance', 'tree', 'beach', 'sky', 'ocean', 'wave', 'shore', 'cloud', 'horizon', 'kite', 'surfer']
2022-03-16 21:33:05,971.971 2829:trainer.py:487 do_train_dict(): eta: 14:36:34 iter: 35500 speed: 287.4 images/sec total_norm: 141.0793 (142.6804) loss: 141.5518 (143.5472) masked_loss: 1.4241 (1.5119) tag_loss: 140.0307 (142.0352) time: 1.4325 (1.7813) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7757) save_time: 8.8805 (18.1110) lr: 0.000047 max mem: 26307
2022-03-16 21:33:06,334.334 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957
2022-03-16 21:33:06,334.334 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.16534423828125
2022-03-16 21:33:06,334.334 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81255073761672
2022-03-16 21:33:24,363.363 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020433004945516586
2022-03-16 21:33:24,364.364 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:33:24,365.365 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'in', 'a', 'field', '##op', '[MASK]', 'tree', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:33:24,380.380 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'head', 'field', 'bush', 'ear', 'tree', 'leg', 'nose', 'mane', 'tail', 'zebra', 'face', 'neck', '[UNK]', 'mouth', 'hill', 'eye', 'stripe', 'background', 'hair', 'spot', 'sky', 'shadow', 'ground', 'body', 'horn', 'back', 'tall', 'brush', 'rock', 'photo', 'white', 'plant', 'bird', 'grassy', 'large', 'dry', 'animal', 'water', 'couple', 'next', 'black', 'snout', 'foot', 'trunk', 'baby', 'herd', 'mountain', 'group', 'cow']
2022-03-16 21:33:40,239.239 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'field', 'tree', 'ear', 'palm', 'grass', 'tail', 'leaf', 'trunk', 'elephant']
2022-03-16 21:36:04,164.164 2829:trainer.py:487 do_train_dict(): eta: 14:33:51 iter: 35600 speed: 287.3 images/sec total_norm: 145.8818 (148.3713) loss: 144.7203 (143.7997) masked_loss: 1.5333 (1.5217) tag_loss: 143.2545 (142.2780) time: 1.4328 (1.7819) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7767) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:36:04,526.526 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.800000011920929
2022-03-16 21:36:04,526.526 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.14825439453125
2022-03-16 21:36:04,526.526 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.80533910799427
2022-03-16 21:36:22,790.790 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020455343648791313
2022-03-16 21:36:22,791.791 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:36:22,791.791 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##raf', '##fe', 'out', 'in', 'the', 'wild', 'on', '[MASK]', 'sunny', 'day', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:36:22,806.806 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'grass', 'bush', 'hill', 'head', 'field', '[UNK]', 'neck', 'leg', 'ear', 'tail', 'mane', 'horn', 'face', 'ground', 'spot', 'mouth', 'nose', 'stripe', 'body', 'eye', 'zebra', 'distance', 'mountain', 'horizon', 'cloud', 'background', 'grassy', 'large', 'plain', 'dirt', 'hair', 'tall', 'dry', 'next', 'other', 'herd', 'baby', 'elephant', 'standing', 'small', 'brush', 'trunk', 'open', 'plant', 'group', 'couple', 'shadow', 'day']
2022-03-16 21:36:38,658.658 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'day', 'body', 'field', 'ground', 'hill', 'plant', 'neck', 'sky', 'wild', 'spot', 'leg', 'ear', 'grass', 'tail', 'bush', 'leaf', 'horn', 'sunny']
2022-03-16 21:39:02,531.531 2829:trainer.py:487 do_train_dict(): eta: 14:31:08 iter: 35700 speed: 287.0 images/sec total_norm: 144.3022 (146.4315) loss: 139.9591 (141.2245) masked_loss: 1.4827 (1.4689) tag_loss: 138.5107 (139.7556) time: 1.4334 (1.7837) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.7785) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:39:02,892.892 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 21:39:02,892.892 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.20672607421875
2022-03-16 21:39:02,892.892 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81341491997576
2022-03-16 21:39:20,991.991 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020464623346924782
2022-03-16 21:39:20,992.992 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:39:20,992.992 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'two', 'plates', '[MASK]', 'pizza', 'and', 'some', 'glasses', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:39:21,007.007 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glass', 'pizza', 'table', 'plate', 'wine', 'hand', 'base', 'fork', 'person', 'shrimp', 'knife', 'stem', 'ring', 'crust', 'slice', 'food', '[UNK]', 'napkin', 'finger', 'shirt', 'woman', 'handle', 'bottom', 'bottle', 'onion', 'vase', 'white', 'wall', 'chair', 'tomato', 'cup', 'dish', 'watch', 'holder', 'couple', 'bowl', 'glasses', 'water', 'top', 'red', 'cheese', 'arm', 'curtain', 'drink', 'cloth', 'necklace', 'light', 'neck', 'wrist', 'topping']
2022-03-16 21:39:36,864.864 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'woman', 'hair', 'person', 'table', 'wall', 'food', 'glass', 'ring', 'bottom', 'drink', 'wine', 'plate', 'knife', 'blade', 'fork', 'dish', 'pizza', 'pepper', 'crust']
2022-03-16 21:42:00,807.807 2829:trainer.py:487 do_train_dict(): eta: 14:28:24 iter: 35800 speed: 287.2 images/sec total_norm: 142.2354 (146.7149) loss: 145.7817 (146.2305) masked_loss: 1.4705 (1.4970) tag_loss: 144.4592 (144.7335) time: 1.4328 (1.7828) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7776) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:42:01,167.167 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 21:42:01,168.168 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.02682495117188
2022-03-16 21:42:01,168.168 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.82260449550277
2022-03-16 21:42:19,501.501 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02045128494501114
2022-03-16 21:42:19,501.501 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:42:19,502.502 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', '[MASK]', '[MASK]', 'on', 'top', 'of', 'a', 'microwave', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:42:19,517.517 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ear', 'cat', 'eye', '[UNK]', 'wall', 'microwave', 'door', 'head', 'handle', 'nose', 'window', 'face', 'panel', 'top', 'pen', 'kitchen', 'shelf', 'cup', 'cabinet', 'container', 'logo', 'glass', 'light', 'refrigerator', 'oven', 'label', 'button', 'basket', 'bag', 'black', 'clock', 'display', 'control', 'box', 'knob', 'pencil', 'cord', 'rack', 'book', 'curtain', 'paper', 'counter', 'board', 'metal', 'reflection', 'screen', 'outlet', 'pot', 'paw', 'drawer']
2022-03-16 21:42:35,437.437 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'black', 'top', 'door', 'light', 'cup', 'control', 'wall', 'eye', 'window', 'box', 'kitchen', 'nose', 'bag', 'ear', 'bowl', 'display', 'cat', 'handle', 'clock', 'cabinet', 'knife', 'panel', 'button', 'pen', 'container', 'pencil', 'microwave']
03-16 21:42:59.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 21:42:59.245 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 21:43:00.587 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 21:44:59,158.158 2829:trainer.py:487 do_train_dict(): eta: 14:25:41 iter: 35900 speed: 287.1 images/sec total_norm: 144.4305 (145.1420) loss: 142.0477 (142.3598) masked_loss: 1.5310 (1.5445) tag_loss: 140.3518 (140.8152) time: 1.4334 (1.7835) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7783) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:44:59,519.519 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 21:44:59,519.519 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.127197265625
2022-03-16 21:44:59,519.519 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.82308253182305
2022-03-16 21:45:18,026.026 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020482752472162247
2022-03-16 21:45:18,027.027 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:45:18,027.027 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'elephants', 'wadi', '##ng', 'through', 'a', 'lake', 'next', 'to', '[MASK]', 'jungle', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:45:18,042.042 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'elephant', 'ear', 'head', 'eye', 'tree', 'leaf', 'back', 'trunk', 'river', 'body', 'mouth', 'face', 'rock', 'bank', 'tail', 'grass', 'branch', 'shore', '[UNK]', 'bush', 'plant', 'large', 'ripple', 'hair', 'leg', 'splash', 'couple', 'top', 'mud', 'waterfall', 'standing', 'shirt', 'nose', 'stick', 'baby', 'gray', 'next', 'shallow', 'arm', 'big', 'grey', 'man', 'reflection', 'drinking', 'young', 'skin', 'group', 'other', 'muddy']
2022-03-16 21:45:34,081.081 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'next', 'water', 'river', 'lake', 'eye', 'ear', 'grass', 'leaf', 'trunk', 'jungle', 'elephant']
2022-03-16 21:47:57,667.667 2829:trainer.py:487 do_train_dict(): eta: 14:22:58 iter: 36000 speed: 286.8 images/sec total_norm: 142.9994 (146.3162) loss: 142.5234 (143.5166) masked_loss: 1.5696 (1.5677) tag_loss: 141.0210 (141.9489) time: 1.4322 (1.7851) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.7799) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:47:58,027.027 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 21:47:58,027.027 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.08287048339844
2022-03-16 21:47:58,027.027 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81849785590767
2022-03-16 21:48:16,291.291 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02050480805337429
2022-03-16 21:48:16,292.292 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:48:16,292.292 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'long', 'paved', 'park', 'path', 'lined', 'with', 'benches', 'that', 'are', 'filled', 'with', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:48:16,307.307 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'bench', 'tree', 'ground', 'shadow', 'park', 'man', 'building', 'snow', 'chair', 'trunk', 'sky', 'hat', 'group', 'head', '[UNK]', 'umbrella', 'shirt', 'shoe', 'pole', 'street', 'road', 'sidewalk', 'photo', 'empty', 'woman', 'white', 'jacket', 'old', 'couple', 'mountain', 'leg', 'light', 'branch', 'car', 'bag', 'many', 'bird', 'lamp', 'background', 'coat', 'wall', 'area', 'bush', 'curb', 'bunch', 'pigeon', 'large', 'roof', 'wooden']
2022-03-16 21:48:32,150.150 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['long', 'man', 'building', 'park', 'ground', 'person', 'tree', 'path', 'shadow', 'bench', 'trunk', 'sidewalk', 'paved', 'curb']
2022-03-16 21:50:55,932.932 2829:trainer.py:487 do_train_dict(): eta: 14:20:15 iter: 36100 speed: 287.2 images/sec total_norm: 144.2137 (145.6423) loss: 146.4961 (147.4221) masked_loss: 1.5244 (1.5625) tag_loss: 144.9690 (145.8596) time: 1.4319 (1.7826) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.7774) save_time: 8.8805 (18.1110) lr: 0.000046 max mem: 26307
2022-03-16 21:50:56,293.293 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.71875
2022-03-16 21:50:56,294.294 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.20558166503906
2022-03-16 21:50:56,294.294 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81455545794239
2022-03-16 21:51:14,578.578 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02050723508000374
2022-03-16 21:51:14,578.578 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:51:14,578.578 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'women', 'with', 'very', 'small', 'swim', '##suit', '##s', 'pose', 'and', 'talk', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:51:14,594.594 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'woman', 'shirt', 'hair', 'head', 'girl', '[UNK]', 'face', 'chain', 'arm', 'nose', 'man', 'person', 'mouth', 'eye', 'bracelet', 'necklace', 'sunglasses', 'ear', 'dress', 'phone', 'top', 'short', 'skirt', 'ring', 'glasses', 'wall', 'sleeve', 'leg', 'hat', 'stripe', 'finger', 'cell', 'tie', 'suit', 'handle', 'boy', 'purse', 'jean', 'jacket', 'shoe', 'belt', 'watch', 'young', 'child', 'pole', 'bag', 'elbow', 'grass', 'strap']
2022-03-16 21:51:30,609.609 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'small', 'band', 'book', 'woman', 'hair', 'girl', 'mouth', 'person', 'floor', 'phone', 'eye', 'cell', 'leg', 'dress', 'nose', 'lip', 'collar', 'costume', 'boot', 'bikini', 'bandage', 'sock']
2022-03-16 21:53:54,338.338 2829:trainer.py:487 do_train_dict(): eta: 14:17:31 iter: 36200 speed: 287.0 images/sec total_norm: 141.8922 (145.1296) loss: 142.7069 (143.6270) masked_loss: 1.5291 (1.5233) tag_loss: 141.0683 (142.1037) time: 1.4321 (1.7841) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.7789) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 21:53:54,702.702 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 21:53:54,702.702 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.873779296875
2022-03-16 21:53:54,702.702 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.80913908356806
2022-03-16 21:54:13,164.164 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020511014387011528
2022-03-16 21:54:13,164.164 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:54:13,165.165 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'yellow', 'fire', 'hydra', '##nt', 'in', 'the', 'middle', 'of', '[MASK]', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:54:13,180.180 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'rock', 'sky', 'ground', 'trunk', 'branch', 'park', 'boulder', '[UNK]', 'field', 'bush', 'wood', 'stone', 'bench', 'dirt', 'moss', 'sign', 'log', 'stump', 'leg', 'top', 'green', 'head', 'next', 'grassy', 'structure', 'front', 'post', 'fire', 'slab', 'ear', 'pole', 'forest', 'area', 'lush', 'flower', 'shadow', 'eye', 'letter', 'red', 'face', 'stick', 'middle', 'leaf', 'hill', 'tail', 'white', 'hand', 'roof']
2022-03-16 21:54:29,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'park', 'field', 'fire', 'ground', 'rock', 'middle', 'tree', 'branch', 'sky', 'yellow', 'chain', 'grass', 'cap', 'pole', 'trunk', 'log', 'boulder']
2022-03-16 21:56:52,809.809 2829:trainer.py:487 do_train_dict(): eta: 14:14:48 iter: 36300 speed: 286.9 images/sec total_norm: 145.9454 (148.2863) loss: 139.2859 (141.0663) masked_loss: 1.5221 (1.5420) tag_loss: 137.9266 (139.5243) time: 1.4317 (1.7847) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.7795) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 21:56:53,170.170 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125
2022-03-16 21:56:53,170.170 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.19607543945312
2022-03-16 21:56:53,170.170 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.82295267922538
2022-03-16 21:57:11,659.659 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020537735894322395
2022-03-16 21:57:11,659.659 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 21:57:11,660.660 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'suit', 'and', '[MASK]', 'wearing', 'a', 'hat', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 21:57:11,675.675 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'building', 'tie', 'man', 'wall', 'car', 'button', 'road', 'hat', 'head', 'street', 'jacket', 'hand', 'line', 'sunglasses', 'snow', 'beard', 'face', 'roof', 'mouth', 'coat', 'nose', 'suit', 'sky', 'sidewalk', 'chimney', 'house', 'light', 'pocket', 'shirt', 'sign', '[UNK]', 'pole', 'glasses', 'fence', 'cap', 'ear', 'tire', 'arm', 'ring', 'truck', 'shutter', 'finger', 'city', 'van', 'tree', 'vehicle', 'wire', 'suv', 'parking']
2022-03-16 21:57:27,539.539 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'house', 'hand', 'face', 'line', 'building', 'road', 'street', 'light', 'car', 'mouth', 'wall', 'van', 'window', 'sky', 'roof', 'nose', 'ear', 'snow', 'suit', 'coat', 'tie', 'tail', 'hat', 'button', 'jacket', 'glasses', 'beard', 'sunglasses', 'chimney', 'windshield']
2022-03-16 21:59:51,487.487 2829:trainer.py:487 do_train_dict(): eta: 14:12:04 iter: 36400 speed: 286.6 images/sec total_norm: 144.1550 (145.9851) loss: 143.1841 (144.8107) masked_loss: 1.5844 (1.5969) tag_loss: 141.5941 (143.2138) time: 1.4326 (1.7868) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4273 (1.7815) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 21:59:51,846.846 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-16 21:59:51,847.847 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.60223388671875
2022-03-16 21:59:51,847.847 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.81565057414852
2022-03-16 22:00:10,265.265 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020538603886961937
2022-03-16 22:00:10,266.266 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:00:10,266.266 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'clear', '[MASK]', 'holds', 'blue', 'and', 'white', 'flowers', 'with', '[MASK]', '##ri', '##gs', 'of', 'greene', '##ry', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:00:10,282.282 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'wall', 'vase', 'brick', 'frame', 'leaf', 'water', 'window', '[UNK]', 'bouquet', 'base', 'glass', 'design', 'stem', 'table', 'ledge', 'star', 'wood', 'blue', 'white', 'bottom', 'building', 'mirror', 'ground', 'purple', 'decoration', 'board', 'front', 'fence', 'picture', 'knot', 'clear', 'reflection', 'handle', 'blind', 'light', 'chair', 'ball', 'ribbon', 'shadow', 'floor', 'top', 'mat', 'bottle', 'full', 'line', 'arrangement', 'door', 'post', 'colorful']
2022-03-16 22:00:26,143.143 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'white', 'blue', 'star', 'wall', 'clear', 'glass', 'frame', 'handle', 'brick', 'flower', 'leaf', 'decoration', 'knot', 'ledge', 'vase']
2022-03-16 22:02:50,001.001 2829:trainer.py:487 do_train_dict(): eta: 14:09:21 iter: 36500 speed: 286.8 images/sec total_norm: 144.3425 (145.8945) loss: 145.7887 (145.9358) masked_loss: 1.5164 (1.5344) tag_loss: 144.2627 (144.4014) time: 1.4322 (1.7851) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.7799) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:02:50,362.362 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 22:02:50,362.362 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.4102783203125
2022-03-16 22:02:50,362.362 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.83403931289423
2022-03-16 22:03:09,089.089 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02055048756301403
2022-03-16 22:03:09,089.089 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:03:09,089.089 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'cat', 'sitting', '[MASK]', 'a', 'laptop', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:03:09,105.105 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'ear', 'keyboard', 'desk', 'book', 'table', 'eye', 'computer', 'door', 'head', 'laptop', 'paper', '[UNK]', 'key', 'mouse', 'screen', 'wall', 'cabinet', 'cord', 'bowl', 'pen', 'bag', 'logo', 'monitor', 'button', 'nose', 'box', 'wire', 'top', 'shelf', 'pad', 'container', 'speaker', 'cd', 'face', 'paw', 'tail', 'front', 'cable', 'pencil', 'lid', 'magazine', 'light', 'picture', 'kitten', 'collar', 'stand', 'cup', 'pot', 'remote']
2022-03-16 22:03:25,053.053 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'book', 'door', 'star', 'table', 'wall', 'eye', 'paper', 'computer', 'screen', 'nose', 'bag', 'ear', 'desk', 'cat', 'cabinet', 'cable', 'mouse', 'monitor', 'keyboard', 'cord', 'container', 'pad', 'laptop', 'icon', 'mat', 'notebook']
2022-03-16 22:05:48,753.753 2829:trainer.py:487 do_train_dict(): eta: 14:06:37 iter: 36600 speed: 286.4 images/sec total_norm: 143.2368 (144.7022) loss: 145.1027 (144.2995) masked_loss: 1.4793 (1.5321) tag_loss: 143.4799 (142.7674) time: 1.4329 (1.7875) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.7820) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:05:49,113.113 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966
2022-03-16 22:05:49,114.114 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.56411743164062
2022-03-16 22:05:49,114.114 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.83486683816936
2022-03-16 22:06:07,798.798 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0206337608397007
2022-03-16 22:06:07,798.798 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:06:07,798.798 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boy', 'watches', '[MASK]', '[MASK]', 'bear', 'chew', 'on', 'a', 'bone', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:06:07,814.814 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'bear', 'shirt', 'eye', 'head', 'ear', 'nose', 'water', 'hair', 'finger', 'boy', 'arm', 'polar', 'person', 'paw', '[UNK]', 'face', 'thumb', 'toy', 'wall', 'fish', 'leg', 'blue', 'animal', 'sleeve', 'mouth', 'rock', 'woman', 'young', 'food', 'ledge', 'child', 'nail', 'bracelet', 'wrist', 'claw', 'ice', 'handle', 'ball', 'bone', 'tree', 'man', 'glasses', 'pool', 'logo', 'little', 'large', 'small', 'snout', 'neck']
2022-03-16 22:06:23,818.818 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'water', 'hair', 'person', 'arm', 'boy', 'eye', 'shirt', 'animal', 'finger', 'nose', 'ear', 'bear', 'bone', 'thumb', 'toy', 'polar', 'chew']
2022-03-16 22:08:47,439.439 2829:trainer.py:487 do_train_dict(): eta: 14:03:53 iter: 36700 speed: 286.5 images/sec total_norm: 142.4067 (144.1378) loss: 145.0082 (146.6317) masked_loss: 1.5218 (1.5158) tag_loss: 143.8188 (145.1158) time: 1.4328 (1.7869) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7817) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:08:47,800.800 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064
2022-03-16 22:08:47,800.800 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.3931884765625
2022-03-16 22:08:47,800.800 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.84291374165079
2022-03-16 22:09:06,399.399 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020683057606220245
2022-03-16 22:09:06,399.399 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:09:06,400.400 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'with', 'a', 'fr', '##is', '##bee', '[MASK]', 'the', 'snow', '##fi', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:09:06,415.415 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'dog', 'ground', 'tree', 'tail', 'ear', '[UNK]', 'leg', 'head', 'nose', 'eye', 'face', 'shadow', 'mouth', 'paw', 'tag', 'collar', 'glove', 'back', 'jacket', 'hat', 'ski', 'mane', 'foot', 'person', 'tongue', 'fur', 'coat', 'track', 'fence', 'hair', 'snowy', 'line', 'bush', 'spot', 'brown', 'shoe', 'background', 'boot', 'neck', 'arm', 'pole', 'skier', 'hand', 'body', 'teeth', 'branch', 'sky', 'man', 'trunk']
2022-03-16 22:09:22,318.318 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'ground', 'eye', 'tree', 'dog', 'nose', 'ear', 'snow', 'tail', 'tag', 'fence', 'paw']
2022-03-16 22:11:46,075.075 2829:trainer.py:487 do_train_dict(): eta: 14:01:10 iter: 36800 speed: 286.6 images/sec total_norm: 142.6517 (145.1114) loss: 144.3600 (145.3605) masked_loss: 1.4691 (1.5138) tag_loss: 142.9871 (143.8468) time: 1.4329 (1.7864) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7812) save_time: 8.8805 (18.1110) lr: 0.000045 max mem: 26307
2022-03-16 22:11:46,436.436 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-16 22:11:46,436.436 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.39910888671875
2022-03-16 22:11:46,436.436 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.83789838878766
2022-03-16 22:12:05,043.043 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02067859284579754
2022-03-16 22:12:05,043.043 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:12:05,043.043 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'and', 'gray', 'keyboard', 'and', 'black', 'mouse', 'on', 'a', '[MASK]', '[MASK]', 'carpet', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:12:05,059.059 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mouse', 'keyboard', 'cord', 'floor', 'ground', '[UNK]', 'button', 'computer', 'wire', 'carpet', 'key', 'pad', 'ipod', 'table', 'remote', 'strap', 'logo', 'phone', 'laptop', 'cell', 'black', 'screen', 'paper', 'controller', 'next', 'camera', 'desk', 'surface', 'plug', 'electronic', 'speaker', 'pen', 'box', 'book', 'control', 'monitor', 'light', 'white', 'reflection', 'small', 'leg', 'antenna', 'case', 'circle', 'game', 'handle', 'cable', 'other', 'tag', 'writing']
2022-03-16 22:12:20,884.884 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'black', 'ground', 'floor', 'surface', 'gray', 'button', 'wire', 'mouse', 'keyboard', 'carpet', 'decorative', 'cord', 'beige']
03-16 22:13:00.685 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 22:13:00.685 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 22:13:02.196 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-16 22:14:45,044.044 2829:trainer.py:487 do_train_dict(): eta: 13:58:26 iter: 36900 speed: 286.1 images/sec total_norm: 143.6793 (145.3927) loss: 142.1302 (143.9641) masked_loss: 1.5465 (1.5477) tag_loss: 140.5966 (142.4164) time: 1.4324 (1.7896) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7845) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307
2022-03-16 22:14:45,405.405 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 22:14:45,405.405 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.97787475585938
2022-03-16 22:14:45,406.406 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.85165461978397
2022-03-16 22:15:04,161.161 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020698657259345055
2022-03-16 22:15:04,161.161 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:15:04,162.162 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'tall', '[MASK]', '[MASK]', 'on', 'a', 'road', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:15:04,177.177 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'clock', 'floor', 'picture', 'grandfather', 'frame', 'base', 'post', 'carpet', 'wood', 'wooden', 'handle', 'door', 'leg', 'stand', 'hand', 'sword', 'panel', 'pole', 'face', 'shadow', 'table', 'plate', 'rug', '[UNK]', 'gun', 'front', 'cabinet', 'room', 'old', 'top', 'book', 'number', 'painting', 'chair', 'rope', 'furniture', 'light', 'holder', 'sign', 'flower', 'mat', 'foot', 'mantle', 'paper', 'next', 'window', 'large', 'antique', 'woman']
2022-03-16 22:15:20,087.087 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['large', 'door', 'road', 'post', 'floor', 'wall', 'base', 'stand', 'gun', 'metal', 'picture', 'tall', 'frame', 'handle', 'clock', 'shadow', 'panel', 'carpet']
2022-03-16 22:17:44,077.077 2829:trainer.py:487 do_train_dict(): eta: 13:55:42 iter: 37000 speed: 286.0 images/sec total_norm: 144.2526 (146.3389) loss: 143.5280 (144.3948) masked_loss: 1.4919 (1.5015) tag_loss: 141.9779 (142.8932) time: 1.4323 (1.7904) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.7852) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307
2022-03-16 22:17:44,437.437 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-16 22:17:44,438.438 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.27552032470703
2022-03-16 22:17:44,438.438 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.8635114376757
2022-03-16 22:18:03,171.171 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020698022097349167
2022-03-16 22:18:03,171.171 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 22:18:03,172.172 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'four', '[MASK]', 'standing', '[MASK]', 'a', 'sidewalk', 'in', 'a', 'city', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 22:18:03,187.187 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', '[UNK]', 'jacket', 'pole', 'building', 'person', 'shoe', 'sidewalk', 'bag', 'woman', 'hair', 'wheel', 'street', 'ground', 'sign', 'purse', 'suit', 'window', 'bench', 'post', 'light', 'coat', 'jean', 'shirt', 'leg', 'sky', 'city', 'tree', 'cover', 'bike', 'bicycle', 'head', 'railing', 'wall', 'fence', 'lamp', 'hand', 'hat', 'clock', 'store', 'board', 'base', 'roof', 'boot', 'flag', 'face', 'car', 'road', 'pipe', 'backpack']
2022-03-16 22:18:19,046.046 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'head', 'man', 'building', 'street', 'woman', 'ground', 'board', 'hair', 'person', 'wall', 'cover', 'window', 'box', 'store', 'sign', 'jean', 'shirt', 'leg', 'bag', 'suit', 'wheel', 'coat', 'pole', 'jacket', 'bike', 'purse', 'shoe', 'sidewalk', 'railing', 'sunglasses', 'graffiti']
2022-03-16 22:20:43,231.231 2829:trainer.py:487 do_train_dict(): eta: 13:52:59 iter: 37100 speed: 285.8 images/sec total_norm: 142.5453 (146.6679) loss: 142.8995 (144.3784) masked_loss: 1.4497 (1.5295) tag_loss: 141.3751 (142.8489) time: 1.4335 (1.7915) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.7864) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307
2022-03-16 22:20:43,590.590 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365
2022-03-16 22:20:43,590.590 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.84902954101562
2022-03-16 22:20:43,591.591 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.86224854377008 2022-03-16 22:21:02,557.557 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0207029040902853 2022-03-16 22:21:02,557.557 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:21:02,557.557 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'sheer', '##ed', 'sheep', 'hu', '[MASK]', 'in', 'a', 'group', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:21:02,573.573 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sheep', 'fence', 'ground', 'dirt', 'head', 'gate', 'field', 'ear', 'face', 'herd', 'post', 'grass', 'leg', '[UNK]', 'leaf', 'trunk', 'pen', 'group', 'pole', 'area', 'nose', 'tail', 'animal', 'branch', 'farm', 'dog', 'net', 'stand', 'flock', 'cow', 'enclosure', 'bush', 'bunch', 'black', 'plant', 'hay', 'other', 'rock', 'mud', 'metal', 'tag', 'flower', 'horn', 'lamb', 'horse', 'next', 'patch', 'shirt', 'grazing'] 2022-03-16 22:21:18,601.601 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'face', 'field', 'ground', 'post', 'tree', 'leg', 'ear', 'gate', 'pole', 'dirt', 'sheep', 'fence', 'hay', 'herd'] 2022-03-16 22:23:42,118.118 2829:trainer.py:487 do_train_dict(): eta: 13:50:15 iter: 37200 speed: 286.2 images/sec total_norm: 143.1843 (146.9494) loss: 143.5977 (144.4710) masked_loss: 1.4802 (1.5095) tag_loss: 141.9933 (142.9615) time: 1.4325 (1.7888) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.7837) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:23:42,479.479 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-16 22:23:42,479.479 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.52777099609375 2022-03-16 22:23:42,479.479 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
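[annotation] The timing fields in the do_train_dict() record above decompose cleanly: time is data + to_device + time_gpu to within rounding (0.0001 + 0.0051 + 1.4273 = 1.4325; 0.0002 + 0.0050 + 1.7837 ~ 1.7888 for the global averages). Throughput and ETA then follow from the global-average iteration time: 512 images / 1.7888 s ~ 286.2 images/sec matches the logged speed, pointing at a global batch of 512 (e.g. 8 GPUs x 64), and the logged "eta: 13:50:15" at iter 37200 fits a max-iteration count near 65,000. A sketch of that arithmetic; the batch size and max_iter are assumptions inferred from the numbers, not values stated in the log:

    import datetime

    def throughput(global_batch_size, avg_iter_time_s):
        # images/sec from the global-average seconds per iteration.
        return global_batch_size / avg_iter_time_s

    def training_eta(avg_iter_time_s, cur_iter, max_iter):
        # "eta: HH:MM:SS" from the remaining iterations.
        remaining_s = (max_iter - cur_iter) * avg_iter_time_s
        return str(datetime.timedelta(seconds=int(remaining_s)))

    # Assumed global batch of 512: 512 / 1.7888 ~= 286.2, matching "speed: 286.2".
    print(throughput(512, 1.7888))
    # Assumed max_iter of 65000: prints "13:48:48", close to the logged "13:50:15".
    print(training_eta(1.7888, 37200, 65000))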
= 70.85706361049621 2022-03-16 22:24:01,491.491 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020694421604275703 2022-03-16 22:24:01,491.491 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:24:01,492.492 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'dive', '[MASK]', 'to', 'catch', 'a', 'fr', '##is', '##be', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:24:01,507.507 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'grass', 'shirt', 'man', 'hair', 'head', 'hand', '[UNK]', 'arm', 'ring', 'beach', 'person', 'boy', 'leg', 'ground', 'sand', 'short', 'face', 'foot', 'water', 'circle', 'woman', 'green', 'child', 'logo', 'kite', 'field', 'board', 'disc', 'ear', 'pole', 'post', 'sleeve', 'mouth', 'air', 'bush', 'watch', 'jean', 'blue', 'dirt', 'design', 'ocean', 'shoe', 'hill', 'wrist', 'couple', 'shore', 'young', 'girl', 'top'] 2022-03-16 22:24:17,388.388 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'water', 'woman', 'ground', 'hair', 'person', 'arm', 'foot', 'beach', 'ring', 'sky', 'shirt', 'leg', 'sand', 'grass'] 2022-03-16 22:26:41,339.339 2829:trainer.py:487 do_train_dict(): eta: 13:47:31 iter: 37300 speed: 285.7 images/sec total_norm: 142.1692 (145.5851) loss: 147.5261 (147.6702) masked_loss: 1.5053 (1.5126) tag_loss: 145.8689 (146.1576) time: 1.4332 (1.7922) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4281 (1.7870) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:26:41,700.700 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-16 22:26:41,701.701 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.5003204345703 2022-03-16 22:26:41,701.701 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
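[annotation] The recurring "# of tokens = 577" is exactly 24*24 + 1, i.e. what a vision transformer produces for a 384x384 image with 16x16 patches plus one [CLS] token. The encoder config is not in this log, so treat the check below as a plausible consistency argument rather than a confirmed setting:

    def vit_token_count(image_size=384, patch_size=16):
        # (image_size / patch_size)^2 patch tokens + 1 [CLS] token.
        return (image_size // patch_size) ** 2 + 1

    assert vit_token_count() == 577  # matches "# of tokens = 577"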
= 70.8602417012587 2022-03-16 22:27:00,623.623 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020728370174765587 2022-03-16 22:27:00,623.623 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:27:00,624.624 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'suitcase', '##s', 'are', '[MASK]', 'to', 'be', 'picked', 'up', 'at', 'the', 'counter', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:27:00,639.639 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['suitcase', 'luggage', 'floor', 'carpet', 'bag', 'tag', 'handle', 'airport', '[UNK]', 'backpack', 'jacket', 'sign', 'wheel', 'ceiling', 'light', 'shirt', 'person', 'man', 'paper', 'column', 'pillar', 'strap', 'ground', 'cart', 'wall', 'case', 'woman', 'zipper', 'poster', 'board', 'purse', 'railing', 'hair', 'claim', 'blue', 'pole', 'coat', 'building', 'display', 'logo', 'jean', 'door', 'belt', 'baggage', 'lobby', 'hand', 'bench', 'window', 'box', 'wheelchair'] 2022-03-16 22:27:16,537.537 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'board', 'floor', 'wall', 'airport', 'sign', 'bag', 'counter', 'handle', 'wheel', 'ceiling', 'column', 'tag', 'jacket', 'carpet', 'poster', 'pillar', 'suitcase', 'luggage', 'zipper'] 2022-03-16 22:29:40,502.502 2829:trainer.py:487 do_train_dict(): eta: 13:44:47 iter: 37400 speed: 285.8 images/sec total_norm: 142.8483 (144.7200) loss: 142.6678 (143.3559) masked_loss: 1.5306 (1.5459) tag_loss: 141.3286 (141.8100) time: 1.4332 (1.7917) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7865) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:29:40,862.862 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 22:29:40,862.862 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.02297973632812 2022-03-16 22:29:40,862.862 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.85464694213867 2022-03-16 22:29:59,918.918 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020721744745969772 2022-03-16 22:29:59,918.918 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:29:59,919.919 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'beautiful', 'green', 'vase', 'is', 'on', 'display', '[MASK]', 'a', 'cabinet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:29:59,934.934 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'vase', 'shadow', 'table', 'flower', 'base', 'design', 'top', 'handle', 'shelf', 'picture', 'display', 'floor', 'door', 'white', 'rim', 'neck', 'stand', 'blue', 'mirror', 'green', 'bowl', 'lid', '[UNK]', 'cloth', 'jar', 'frame', 'glass', 'reflection', 'leaf', 'light', 'pot', 'decorative', 'outlet', 'bottom', 'paper', 'small', 'sign', 'room', 'dot', 'container', 'stem', 'next', 'front', 'box', 'window', 'different', 'black', 'large', 'side'] 2022-03-16 22:30:15,775.775 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['door', 'green', 'floor', 'table', 'wall', 'base', 'paper', 'beautiful', 'display', 'shadow', 'cabinet', 'mirror', 'flower', 'rim', 'vase'] 2022-03-16 22:32:39,814.814 2829:trainer.py:487 do_train_dict(): eta: 13:42:04 iter: 37500 speed: 285.5 images/sec total_norm: 145.4215 (150.6948) loss: 145.9200 (145.5019) masked_loss: 1.5618 (1.5809) tag_loss: 144.4152 (143.9209) time: 1.4334 (1.7931) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.7879) save_time: 8.8805 (18.1110) lr: 0.000044 max mem: 26307 2022-03-16 22:32:40,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-16 22:32:40,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.19025421142578 2022-03-16 22:32:40,180.180 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
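[annotation] The "caption acc" values are ratios with small denominators (0.7575757503509521 = 25/33 in the record above; 0.59375 = 19/32 appears later), which fits an accuracy computed only over the [MASK] positions of the logged batch's captions rather than over all tokens. A PyTorch sketch of that computation with invented tensor names; the pipeline's own code at tagger_caption_uni_pipeline_expanding.py:404 is not shown in the log:

    import torch

    def masked_caption_accuracy(logits, target_ids, input_ids, mask_token_id):
        # Accuracy over positions where the input was [MASK]; visible words
        # and [PAD] positions are ignored.
        masked = input_ids == mask_token_id
        preds = logits.argmax(dim=-1)
        correct = (preds[masked] == target_ids[masked]).float()
        return correct.mean().item()

    # Toy example: 4 positions, 2 masked, 1 predicted correctly -> 0.5
    vocab = 10
    logits = torch.zeros(1, 4, vocab)
    logits[0, 1, 3] = 1.0   # prediction at the first masked slot
    logits[0, 2, 7] = 1.0   # prediction at the second masked slot
    input_ids = torch.tensor([[5, 0, 0, 6]])   # 0 = assumed [MASK] id
    target_ids = torch.tensor([[5, 3, 4, 6]])  # ground-truth caption ids
    print(masked_caption_accuracy(logits, target_ids, input_ids, mask_token_id=0))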
= 70.85614838498704 2022-03-16 22:32:59,371.371 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02071467787027359 2022-03-16 22:32:59,371.371 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:32:59,372.372 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'pizza', '##s', 'in', 'delivery', 'boxes', 'are', 'on', 'the', 'table', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:32:59,387.387 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['pizza', 'table', 'box', 'bag', '[UNK]', 'lid', 'reflection', 'cheese', 'topping', 'bottle', 'crust', 'slice', 'glass', 'person', 'pepper', 'floor', 'hand', 'tomato', 'food', 'mushroom', 'paper', 'label', 'top', 'bowl', 'light', 'napkin', 'chair', 'counter', 'plate', 'water', 'wall', 'shrimp', 'cardboard', 'handle', 'cup', 'onion', 'next', 'different', 'container', 'meat', 'spoon', 'fork', 'shirt', 'sausage', 'phone', 'large', 'shadow', 'spot', 'soda', 'wooden'] 2022-03-16 22:33:15,364.364 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'box', 'bag', 'object', 'delivery', 'cheese', 'reflection', 'pizza', 'pepper', 'lid', 'mushroom', 'crust', 'tomato', 'topping', 'pea'] 2022-03-16 22:35:39,107.107 2829:trainer.py:487 do_train_dict(): eta: 13:39:20 iter: 37600 speed: 285.6 images/sec total_norm: 142.8860 (146.0319) loss: 142.9943 (144.7085) masked_loss: 1.5040 (1.5274) tag_loss: 141.8533 (143.1811) time: 1.4324 (1.7929) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.7877) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:35:39,468.468 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-16 22:35:39,468.468 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.0895538330078 2022-03-16 22:35:39,468.468 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.85329472670821 2022-03-16 22:35:58,510.510 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02071584016084671 2022-03-16 22:35:58,510.510 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:35:58,510.510 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'flock', 'of', 'sheer', '[MASK]', 'sheep', 'in', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:35:58,526.526 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'sheep', 'leg', 'head', 'field', 'ear', 'face', 'tail', 'fence', 'nose', 'bush', 'lamb', '[UNK]', 'grassy', 'spot', 'green', 'eye', 'mouth', 'wool', 'post', 'pasture', 'white', 'herd', 'animal', 'tree', 'lush', 'group', 'plant', 'bird', 'grazing', 'standing', 'meadow', 'couple', 'background', 'open', 'top', 'foot', 'black', 'weed', 'other', 'goat', 'road', 'neck', 'dog', 'large', 'pole', 'small', 'next', 'walking', 'horn'] 2022-03-16 22:36:14,433.433 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'face', 'field', 'leg', 'ear', 'grass', 'sheep', 'fence', 'lamb', 'flock'] 2022-03-16 22:38:38,502.502 2829:trainer.py:487 do_train_dict(): eta: 13:36:36 iter: 37700 speed: 285.4 images/sec total_norm: 141.9469 (144.5396) loss: 139.9071 (141.3757) masked_loss: 1.4578 (1.4852) tag_loss: 138.2680 (139.8905) time: 1.4324 (1.7940) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7886) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:38:38,862.862 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 22:38:38,863.863 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.47189331054688 2022-03-16 22:38:38,863.863 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.84930174691337 2022-03-16 22:38:58,099.099 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020729536190629005 2022-03-16 22:38:58,099.099 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:38:58,100.100 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'baseball', '[MASK]', 'is', 'hitting', 'the', 'ball', 'with', 'his', 'bat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:38:58,115.115 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'belt', 'glove', '[UNK]', 'shirt', 'baseball', 'head', 'player', 'stripe', 'bat', 'uniform', 'jersey', 'arm', 'helmet', 'logo', 'hat', 'hand', 'fence', 'person', 'wall', 'face', 'sleeve', 'band', 'name', 'cap', 'number', 'tree', 'ear', 'field', 'stadium', 'dirt', 'nose', 'grass', 'ball', 'sky', 'letter', 'base', 'background', 'leg', 'buckle', 'stand', 'crowd', 'sign', 'pole', 'shoe', 'writing', 'hair', 'mouth', 'beard', 'batter'] 2022-03-16 22:39:14,048.048 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'player', 'person', 'wall', 'arm', 'stand', 'baseball', 'ball', 'sign', 'shirt', 'jersey', 'belt', 'cap', 'uniform', 'bat', 'logo', 'sleeve', 'helmet', 'glove', 'stripe', 'spectator'] 2022-03-16 22:41:37,696.696 2829:trainer.py:487 do_train_dict(): eta: 13:33:52 iter: 37800 speed: 285.7 images/sec total_norm: 145.2930 (146.5777) loss: 143.3892 (144.4144) masked_loss: 1.4745 (1.5224) tag_loss: 142.1373 (142.8920) time: 1.4326 (1.7920) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4275 (1.7867) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:41:38,059.059 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-16 22:41:38,059.059 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.5678482055664 2022-03-16 22:41:38,060.060 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
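[annotation] Each logging step prints a 50-item "Sample Generation" list against a "GT Tags" list, a Tag Precision near 70.9 (a percentage), and a Tag mAP near 0.021; a mAP that small is typical of macro-averaging over a large tag vocabulary in which most classes are absent from the batch. The definitions below are a guess at what lines 409/413 of the pipeline compute, since that code is not included here:

    import numpy as np
    from sklearn.metrics import average_precision_score

    def tag_metrics(scores, gt, k=50):
        # scores: (N, V) per-tag scores; gt: (N, V) multi-hot ground truth.
        # Precision@k as a percentage (cf. "Tag Precision. = 70.8...") and a
        # macro mAP over tag classes with positives (cf. "Tag mAP: 0.02...").
        topk = np.argsort(-scores, axis=1)[:, :k]
        hits = np.take_along_axis(gt, topk, axis=1)
        precision_at_k = 100.0 * hits.mean()
        aps = [average_precision_score(gt[:, j], scores[:, j])
               for j in range(gt.shape[1]) if gt[:, j].any()]
        return precision_at_k, float(np.mean(aps))

    rng = np.random.default_rng(0)
    scores = rng.random((8, 200))                   # toy scores for 200 tags
    gt = (rng.random((8, 200)) > 0.9).astype(int)   # sparse multi-hot labels
    print(tag_metrics(scores, gt, k=50))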
= 70.85469861395441 2022-03-16 22:41:57,362.362 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020766209810972214 2022-03-16 22:41:57,362.362 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:41:57,363.363 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'salad', 'mixed', '##ச', '[MASK]', ',', 'bro', '##cco', '##li', ',', 'and', 'other', 'items', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:41:57,378.378 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tomato', '[UNK]', 'olive', 'pepper', 'bowl', 'salad', 'chicken', 'pasta', 'pan', 'mushroom', 'carrot', 'shrimp', 'food', 'plate', 'pot', 'bean', 'meat', 'table', 'cheese', 'vegetable', 'rice', 'handle', 'dish', 'rim', 'potato', 'corn', 'mixed', 'cherry', 'black', 'spoon', 'full', 'pea', 'stove', 'bread', 'pizza', 'lemon', 'different', 'background', 'top', 'many', 'fruit', 'stir', 'meal', 'fry', 'napkin', 'wall', 'close', 'stem', 'picture', 'red'] 2022-03-16 22:42:13,315.315 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'mixed', 'bowl', 'plate', 'pan', 'olive', 'corn', 'pepper', 'bean', 'lemon', 'salad', 'shrimp', 'tomato', 'pasta', 'carrot'] 03-16 22:43:02.278 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 22:43:02.278 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 22:43:03.319 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 22:44:37,040.040 2829:trainer.py:487 do_train_dict(): eta: 13:31:07 iter: 37900 speed: 285.5 images/sec total_norm: 141.8918 (145.0829) loss: 145.2153 (144.1030) masked_loss: 1.4875 (1.5336) tag_loss: 143.7569 (142.5694) time: 1.4323 (1.7934) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.7882) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:44:37,402.402 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-16 22:44:37,402.402 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.47735595703125 2022-03-16 22:44:37,402.402 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
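[annotation] The monitor() record above condenses the full nvidia-smi table into one dict per GPU ({'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}): all eight V100s at 100% utilization with ~29,000 MiB of 32,510 MiB in use, consistent with the trainer's "max mem: 26307" plus CUDA context overhead. One way to produce exactly those dicts is nvidia-smi's CSV query interface, sketched below; aml_server.py may instead parse the default table output:

    import subprocess

    def gpu_monitor():
        # One dict per GPU, shaped like the monitor() output above.
        out = subprocess.check_output(
            ["nvidia-smi",
             "--query-gpu=memory.used,memory.total,utilization.gpu",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        stats = []
        for line in out.strip().splitlines():
            used, total, util = (int(v) for v in line.split(", "))
            stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
        return stats

    if __name__ == "__main__":
        print(gpu_monitor())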
= 70.85262803529439 2022-03-16 22:44:56,648.648 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020804809406399727 2022-03-16 22:44:56,649.649 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:44:56,649.649 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'walking', 'in', 'between', 'some', 'trees', 'in', 'a', 'field', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:44:56,665.665 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'building', 'window', 'rock', 'log', 'balcony', 'ground', 'tree', 'path', 'animal', 'railing', 'bush', 'field', '[UNK]', 'trunk', 'head', 'pole', 'fence', 'branch', 'wood', 'ear', 'nose', 'cow', 'roof', 'post', 'dirt', 'road', 'deer', 'front', 'leg', 'house', 'door', 'horn', 'bear', 'second', 'tall', 'neck', 'next', 'large', 'elephant', 'tail', 'boulder', 'background', 'grassy', 'wall', 'bird', 'curtain', 'zebra', 'pathway', 'stone'] 2022-03-16 22:45:12,563.563 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'building', 'field', 'ground', 'rock', 'window', 'tree', 'wood', 'animal', 'path', 'leg', 'nose', 'palm', 'grass', 'trunk', 'log', 'balcony', 'zebra'] 2022-03-16 22:47:36,287.287 2829:trainer.py:487 do_train_dict(): eta: 13:28:23 iter: 38000 speed: 285.6 images/sec total_norm: 143.5322 (146.8267) loss: 139.6204 (143.0550) masked_loss: 1.4604 (1.5004) tag_loss: 137.9492 (141.5546) time: 1.4321 (1.7925) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4268 (1.7872) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:47:36,647.647 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-16 22:47:36,648.648 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.0045166015625 2022-03-16 22:47:36,648.648 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.87058102865545 2022-03-16 22:47:56,062.062 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02083042450249195 2022-03-16 22:47:56,062.062 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:47:56,063.063 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'horse', 'graz', '##es', 'for', 'grass', '[MASK]', 'a', 'plain', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:47:56,078.078 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'sky', 'leg', 'bush', 'tree', 'horse', 'field', 'grass', 'tail', 'ear', 'neck', 'mane', 'cloud', 'ground', 'shadow', 'face', '[UNK]', 'nose', 'water', 'patch', 'standing', 'body', 'distance', 'eye', 'mouth', 'open', 'grazing', 'animal', 'grassy', 'hair', 'brown', 'dirt', 'large', 'building', 'white', 'area', 'next', 'house', 'pole', 'front', 'puddle', 'black', 'day', 'wild', 'middle', 'top', 'background', 'spot', 'plant', 'hill'] 2022-03-16 22:48:12,054.054 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'field', 'ground', 'neck', 'tree', 'horse', 'sky', 'leg', 'ear', 'shadow', 'grass', 'tail', 'bush', 'plain', 'cloud', 'mane'] 2022-03-16 22:50:35,651.651 2829:trainer.py:487 do_train_dict(): eta: 13:25:39 iter: 38100 speed: 285.5 images/sec total_norm: 144.8026 (147.0265) loss: 141.8485 (143.8578) masked_loss: 1.4973 (1.5103) tag_loss: 140.5169 (142.3475) time: 1.4327 (1.7937) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7885) save_time: 8.8805 (18.1110) lr: 0.000043 max mem: 26307 2022-03-16 22:50:36,011.011 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-16 22:50:36,012.012 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.73191833496094 2022-03-16 22:50:36,012.012 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.87913879674143 2022-03-16 22:50:55,511.511 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020914802327752113 2022-03-16 22:50:55,512.512 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:50:55,512.512 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'clock', 'in', 'middle', 'of', 'a', 'sculpture', '[MASK]', 'top', '[MASK]', 'building', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:50:55,527.527 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'hand', 'building', 'statue', 'clock', 'head', 'sculpture', 'face', 'wall', 'man', 'crown', 'lion', 'fence', 'hair', 'number', '[UNK]', 'leg', 'gold', 'top', 'ledge', 'horse', 'window', 'wing', 'sun', 'sword', 'large', 'railing', 'design', 'ear', 'column', 'blue', 'pillar', 'fountain', 'pole', 'roman', 'shield', 'frame', 'balcony', 'horn', 'eagle', 'person', 'decoration', 'tail', 'metal', 'side', 'background', 'shadow', 'base', 'flower', 'ornate'] 2022-03-16 22:51:11,456.456 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'top', 'middle', 'wall', 'sky', 'crown', 'clock', 'brick', 'statue', 'sculpture', 'lion', 'fence'] 2022-03-16 22:53:35,016.016 2829:trainer.py:487 do_train_dict(): eta: 13:22:55 iter: 38200 speed: 285.5 images/sec total_norm: 144.0508 (145.6414) loss: 145.1267 (146.4171) masked_loss: 1.5148 (1.5391) tag_loss: 143.5140 (144.8780) time: 1.4322 (1.7936) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.7885) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 22:53:35,375.375 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 22:53:35,375.375 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.30453491210938 2022-03-16 22:53:35,376.376 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
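[annotation] Across these records the learning rate steps down slowly: 0.000044 at iter 37000, 0.000043 by iter 37600, 0.000042 by iter 38200. That pace is consistent with, though not proof of, linear decay from a base LR near 1e-4 toward zero at roughly iteration 65k; the sketch below assumes exactly that, since the job's real scheduler is not in the log:

    def linear_decay_lr(base_lr, cur_iter, max_iter, warmup_iters=0):
        # Optional linear warmup, then linear decay to zero.
        if cur_iter < warmup_iters:
            return base_lr * cur_iter / max(warmup_iters, 1)
        progress = (cur_iter - warmup_iters) / max(max_iter - warmup_iters, 1)
        return base_lr * max(0.0, 1.0 - progress)

    # Assumed base_lr = 1e-4 and max_iter = 65000:
    print(round(linear_decay_lr(1e-4, 37000, 65000), 6))  # 4.3e-05, near the logged 0.000044
    print(round(linear_decay_lr(1e-4, 38200, 65000), 6))  # 4.1e-05, near the logged 0.000042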
= 70.88548150324013 2022-03-16 22:53:54,669.669 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0208915863186121 2022-03-16 22:53:54,669.669 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:53:54,669.669 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', 'guys', 'playing', '[MASK]', 'pathway', 'a', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:53:54,685.685 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'tree', 'glove', '[UNK]', 'number', 'wall', 'shadow', 'helmet', 'player', 'shoe', 'line', 'man', 'jersey', 'sky', 'field', 'sign', 'uniform', 'pole', 'hat', 'baseball', 'fence', 'head', 'person', 'bat', 'hand', 'grass', 'cap', 'building', 'net', 'window', 'dirt', 'jacket', 'leg', 'base', 'boy', 'umpire', 'cloud', 'mask', 'game', 'back', 'banner', 'ball', 'catcher', 'goal', 'background', 'girl', 'young', 'plate', 'ready', 'team'] 2022-03-16 22:54:10,538.538 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'line', 'player', 'field', 'person', 'child', 'wall', 'window', 'tree', 'baseball', 'sign', 'sky', 'shirt', 'jersey', 'leg', 'shadow', 'net', 'cap', 'uniform', 'pole', 'jacket', 'dirt', 'bat', 'fence', 'collar', 'bunch', 'helmet', 'shoe', 'glove', 'stripe'] 2022-03-16 22:56:34,656.656 2829:trainer.py:487 do_train_dict(): eta: 13:20:10 iter: 38300 speed: 285.0 images/sec total_norm: 145.9730 (147.5817) loss: 145.0234 (143.9718) masked_loss: 1.5027 (1.5331) tag_loss: 143.7574 (142.4387) time: 1.4336 (1.7965) data: 0.0001 (0.0001) to_device: 0.0051 (0.0049) time_gpu: 1.4286 (1.7914) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 22:56:35,017.017 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.48571428656578064 2022-03-16 22:56:35,017.017 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.02793884277344 2022-03-16 22:56:35,018.018 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88664351900418 2022-03-16 22:56:54,676.676 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02089565619826317 2022-03-16 22:56:54,676.676 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:56:54,677.677 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'laying', 'in', 'bed', 'next', 'to', '[MASK]', 'dog', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:56:54,692.692 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'hand', 'face', 'dog', 'arm', 'hair', 'shirt', 'person', 'man', 'eye', 'bear', 'nose', 'ear', 'animal', 'picture', 'woman', '[UNK]', 'chair', 'cat', 'boy', 'paw', 'wall', 'mouth', 'couch', 'stuffed', 'blanket', 'teddy', 'pillow', 'finger', 'floor', 'collar', 'leg', 'mirror', 'girl', 'hat', 'glasses', 'window', 'table', 'child', 'bow', 'foot', 'photo', 'tail', 'curtain', 'book', 'neck', 'tie', 'phone', 'watch', 'baby'] 2022-03-16 22:57:10,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'line', 'hair', 'mouth', 'person', 'floor', 'bed', 'arm', 'eye', 'shirt', 'dog', 'spot', 'finger', 'nose', 'ear', 'cheek', 'shadow', 'blanket', 'collar', 'eyebrow', 'beard', 'paw'] 2022-03-16 22:59:34,439.439 2829:trainer.py:487 do_train_dict(): eta: 13:17:26 iter: 38400 speed: 284.8 images/sec total_norm: 146.1122 (147.8157) loss: 143.1011 (145.3901) masked_loss: 1.5180 (1.5304) tag_loss: 141.6687 (143.8597) time: 1.4331 (1.7978) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.7926) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 22:59:34,800.800 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-16 22:59:34,800.800 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.5529022216797 2022-03-16 22:59:34,800.800 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88763969718636 2022-03-16 22:59:54,392.392 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020907152444124222 2022-03-16 22:59:54,392.392 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 22:59:54,392.392 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'harbor', 'full', 'of', 'white', 'boats', '[MASK]', 'a', 'plane', 'in', 'the', 'wheeling', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 22:59:54,408.408 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'cloud', 'boat', 'tree', 'airplane', 'water', 'pole', 'building', '[UNK]', 'canopy', 'dock', 'harbor', 'tail', 'wing', 'tent', 'ground', 'lot', 'car', 'flag', 'post', 'white', 'stripe', 'roof', 'marina', 'large', 'airport', 'plane', 'day', 'cloudy', 'cover', 'background', 'lake', 'person', 'ship', 'blue', 'shore', 'engine', 'window', 'mast', 'pier', 'body', 'number', 'sign', 'reflection', 'parking', 'vehicle', 'grass', 'bridge', 'beach', 'distance'] 2022-03-16 23:00:10,370.370 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'white', 'full', 'cover', 'window', 'tree', 'sky', 'boat', 'plane', 'cloud', 'harbor', 'pole', 'airplane', 'canopy', 'stripe'] 2022-03-16 23:02:33,882.882 2829:trainer.py:487 do_train_dict(): eta: 13:14:42 iter: 38500 speed: 285.3 images/sec total_norm: 144.9681 (147.2144) loss: 143.9183 (144.7134) masked_loss: 1.4767 (1.5306) tag_loss: 142.4166 (143.1828) time: 1.4326 (1.7944) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4274 (1.7894) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:02:34,243.243 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-16 23:02:34,243.243 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.63296508789062 2022-03-16 23:02:34,243.243 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88041144455035 2022-03-16 23:02:53,865.865 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020896276459097862 2022-03-16 23:02:53,865.865 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:02:53,865.865 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'oakland', 'and', 'vegetables', 'are', 'lying', 'on', 'a', 'bar', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:02:53,881.881 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'table', '[UNK]', 'leaf', 'container', 'fruit', 'banana', 'vegetable', 'apple', 'bag', 'box', 'wall', 'lamp', 'food', 'logo', 'lid', 'bottle', 'pitcher', 'sign', 'counter', 'top', 'plant', 'stem', 'vent', 'basket', 'label', 'desk', 'cabinet', 'onion', 'jug', 'pole', 'squash', 'handle', 'mirror', 'pepper', 'mango', 'tray', 'base', 'bowl', 'tomato', 'television', 'paper', 'drawer', 'building', 'bunch', 'bar', 'monitor', 'pear', 'other', 'picture'] 2022-03-16 23:03:09,901.901 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'television', 'table', 'food', 'window', 'bar', 'box', 'sign', 'bag', 'camera', 'fruit', 'apple', 'leaf', 'stem', 'pitcher', 'lamp', 'cord', 'container', 'banana', 'vegetable'] 2022-03-16 23:05:33,548.548 2829:trainer.py:487 do_train_dict(): eta: 13:11:57 iter: 38600 speed: 285.0 images/sec total_norm: 144.4205 (146.9250) loss: 139.2003 (141.5523) masked_loss: 1.4535 (1.5045) tag_loss: 137.8857 (140.0479) time: 1.4333 (1.7966) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7914) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:05:33,909.909 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-16 23:05:33,909.909 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.18856811523438 2022-03-16 23:05:33,909.909 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.88768414440698 2022-03-16 23:05:53,750.750 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0209027212113142 2022-03-16 23:05:53,750.750 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:05:53,751.751 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', '[MASK]', 'her', 'dog', '[MASK]', 'a', '[MASK]', 'down', 'a', 'path', 'in', 'the', 'woods', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:05:53,766.766 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'dog', 'woman', 'jean', 'arm', 'shirt', 'hand', 'tree', 'ground', 'forest', 'path', 'trail', 'tail', 'leg', 'face', 'leash', 'wood', 'bag', 'head', 'plant', 'bush', 'backpack', 'top', '[UNK]', 'shoe', 'dirt', 'necklace', 'tongue', 'nose', 'collar', 'ear', 'shadow', 'leaf', 'harness', 'neck', 'girl', 'paw', 'foot', 'eye', 'lady', 'mouth', 'tank', 'weed', 'strap', 'rock', 'watch', 'grass', 'wooded', 'sky', 'person'] 2022-03-16 23:06:09,774.774 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'woman', 'ground', 'hair', 'arm', 'forest', 'plant', 'walk', 'foot', 'tree', 'wood', 'jean', 'shirt', 'dog', 'path', 'leg', 'tongue', 'trail', 'bag', 'tail', 'bush', 'dirt', 'shoe', 'necklace', 'backpack', 'harness', 'leash'] 2022-03-16 23:08:33,207.207 2829:trainer.py:487 do_train_dict(): eta: 13:09:13 iter: 38700 speed: 285.0 images/sec total_norm: 144.2442 (146.5112) loss: 146.3764 (144.7257) masked_loss: 1.4183 (1.4822) tag_loss: 145.1942 (143.2435) time: 1.4322 (1.7966) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.7914) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:08:33,567.567 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5675675868988037 2022-03-16 23:08:33,567.567 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.6092529296875 2022-03-16 23:08:33,567.567 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
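[annotation] The "total_norm" meter hovering around 145 in these records is the kind of value clip_grad_norm_-style utilities return: the global L2 norm over all parameter gradients, logged every iteration whether or not clipping actually triggers. A minimal PyTorch sketch; whether trainer.py clips at some threshold is not visible in this log:

    import torch

    def total_grad_norm(parameters):
        # Global L2 norm across all parameter gradients, as a plain float.
        norms = [p.grad.detach().norm(2) for p in parameters if p.grad is not None]
        if not norms:
            return 0.0
        return torch.norm(torch.stack(norms), 2).item()

    model = torch.nn.Linear(4, 2)
    loss = model(torch.randn(3, 4)).sum()
    loss.backward()
    print(total_grad_norm(model.parameters()))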
= 70.89115385665107 2022-03-16 23:08:54,133.133 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020916670560836792 2022-03-16 23:08:54,133.133 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:08:54,133.133 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'gate', 'requires', 'a', 'key', 'but', 'it', 'is', 'locked', 'now', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:08:54,149.149 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sign', 'sidewalk', 'car', 'grass', 'bush', 'fence', 'ground', 'street', '[UNK]', 'bench', 'gate', 'trunk', 'park', 'chain', 'tire', 'suv', 'road', 'light', 'window', 'building', 'wall', 'person', 'pole', 'design', 'parking', 'fire', 'flower', 'woman', 'vehicle', 'plant', 'background', 'leaf', 'wheel', 'jacket', 'man', 'post', 'jeep', 'chair', 'letter', 'truck', 'hair', 'railing', 'motorcycle', 'branch', 'lamp', 'sky', 'rack', 'leg', 'word'] 2022-03-16 23:09:10,069.069 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['park', 'car', 'ground', 'wall', 'key', 'tree', 'sign', 'chain', 'gate', 'wheel', 'bush', 'lock', 'leaf', 'trunk', 'fence', 'sidewalk', 'jeep', 'suv', 'leash'] 2022-03-16 23:11:34,046.046 2829:trainer.py:487 do_train_dict(): eta: 13:06:29 iter: 38800 speed: 283.1 images/sec total_norm: 143.0366 (145.6505) loss: 143.0662 (142.6486) masked_loss: 1.4179 (1.4743) tag_loss: 141.6093 (141.1743) time: 1.4321 (1.8084) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4267 (1.8028) save_time: 8.8805 (18.1110) lr: 0.000042 max mem: 26307 2022-03-16 23:11:34,408.408 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-16 23:11:34,408.408 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.4253158569336 2022-03-16 23:11:34,408.408 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.9113605862105 2022-03-16 23:11:54,016.016 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.020928820595145226 2022-03-16 23:11:54,017.017 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:11:54,017.017 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'up', '##cl', '##ose', '[MASK]', 'of', '[MASK]', 'zebra', 'accent', '##ing', 'its', 'stripes', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:11:54,032.032 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['stripe', 'zebra', 'close', '[UNK]', 'neck', 'white', 'black', 'side', 'eye', 'line', 'striped', 'other', 'large', 'spot', 'next', 'wall', 'shot', 'face', 'surface', 'red', 'blue', 'shadow', 'head', 'ear', 'front', 'round', 'open', 'brown', 'design', 'view', 'many', 'image', 'area', 'different', 'leg', 'colorful', 'light', 'long', 'pair', 'middle', 'big', 'small', 'green', 'plain', 'picture', 'number', 'row', 'strip', 'pattern', 'group'] 2022-03-16 23:12:09,876.876 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'image', 'stripe', 'zebra'] 03-16 23:13:03.373 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-16 23:13:03.373 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-16 23:13:04.544 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-16 23:14:33,850.850 2829:trainer.py:487 do_train_dict(): eta: 13:03:44 iter: 38900 speed: 284.8 images/sec total_norm: 144.9280 (146.3157) loss: 139.7748 (143.3813) masked_loss: 1.5157 (1.5194) tag_loss: 138.2592 (141.8619) time: 1.4329 (1.7980) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.7929) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:14:34,212.212 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-16 23:14:34,212.212 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.31866455078125 2022-03-16 23:14:34,212.212 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.91594096452762 2022-03-16 23:14:54,107.107 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02100761979818344 2022-03-16 23:14:54,107.107 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:14:54,108.108 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pitcher', '[MASK]', 'just', 'finished', '[MASK]', 'a', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:14:54,123.123 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'field', '[UNK]', 'fence', 'head', 'dirt', 'glove', 'pole', 'stripe', 'shirt', 'baseball', 'shoe', 'man', 'sign', 'leg', 'bar', 'uniform', 'jersey', 'hand', 'shadow', 'mound', 'logo', 'player', 'number', 'hat', 'ground', 'ball', 'cap', 'arm', 'tree', 'pitcher', 'post', 'belt', 'game', 'pitch', 'sleeve', 'background', 'letter', 'face', 'ear', 'sock', 'wall', 'young', 'plate', 'boy', 'line', 'back', 'nose', 'base', 'ready'] 2022-03-16 23:15:10,014.014 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'player', 'field', 'ground', 'arm', 'bar', 'tree', 'ball', 'letter', 'sign', 'shirt', 'jersey', 'path', 'leg', 'shadow', 'grass', 'hat', 'uniform', 'pole', 'dirt', 'pitcher', 'fence', 'shoe', 'mound', 'necklace', 'glove', 'stripe'] 2022-03-16 23:17:33,687.687 2829:trainer.py:487 do_train_dict(): eta: 13:01:00 iter: 39000 speed: 284.7 images/sec total_norm: 144.6783 (146.4727) loss: 141.0400 (142.9053) masked_loss: 1.4229 (1.5149) tag_loss: 139.7280 (141.3904) time: 1.4330 (1.7983) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4277 (1.7931) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:17:34,046.046 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-16 23:17:34,047.047 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.14022827148438 2022-03-16 23:17:34,047.047 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.92032150112455 2022-03-16 23:17:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021010620519518852 2022-03-16 23:17:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:17:53,794.794 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'the', 'air', 'doing', 'a', 'trick', 'on', 'his', '[MASK]', '##board', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:17:53,810.810 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'arm', 'man', 'tree', 'short', '[UNK]', 'shoe', 'hand', 'head', 'hair', 'ear', 'wheel', 'leg', 'bracelet', 'watch', 'wrist', 'boy', 'air', 'belt', 'band', 'face', 'board', 'young', 'trick', 'foot', 'logo', 'pocket', 'design', 'sky', 'sleeve', 'background', 'skate', 'knee', 'ground', 'hat', 'nose', 'jumping', 'sunglasses', 'small', 'mid', 'glasses', 'shadow', 'person', 'jump', 'fence', 'helmet', 'stripe', 'mouth', 'grass', 'bush'] 2022-03-16 23:18:09,817.817 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'air', 'short', 'hair', 'design', 'arm', 'tree', 'watch', 'sky', 'shirt', 'leg', 'ear', 'wheel', 'wrist', 'trick', 'shoe'] 2022-03-16 23:20:33,582.582 2829:trainer.py:487 do_train_dict(): eta: 12:58:15 iter: 39100 speed: 284.6 images/sec total_norm: 148.3647 (151.0380) loss: 143.3880 (144.0234) masked_loss: 1.4590 (1.4742) tag_loss: 141.8699 (142.5493) time: 1.4337 (1.7990) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4286 (1.7938) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:20:33,943.943 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-16 23:20:33,944.944 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.30784606933594 2022-03-16 23:20:33,944.944 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.92602927344186 2022-03-16 23:20:53,834.834 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021004609763622284 2022-03-16 23:20:53,835.835 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:20:53,835.835 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'small', 'pup', '##pies', 'eating', 'dog', 'food', 'out', 'of', '[MASK]', 'large', 'bowl', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:20:53,850.850 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'face', 'dog', 'nose', 'paw', 'ear', 'letter', 'eye', 'writing', 'bucket', 'tail', 'stripe', 'food', 'bean', '[UNK]', 'leg', 'word', 'fur', 'floor', 'barrel', 'puppy', 'bowl', 'white', 'mat', 'back', 'small', 'rim', 'animal', 'cereal', 'can', 'trash', 'pot', 'object', 'plate', 'dish', 'mouth', 'container', 'table', 'mushroom', 'water', 'ground', 'wall', 'pony', 'body', 'front', 'black', 'tire', 'tile', 'lettering', 'spot'] 2022-03-16 23:21:09,674.674 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'small', 'line', 'large', 'floor', 'food', 'writing', 'eye', 'letter', 'dog', 'leg', 'nose', 'ear', 'bowl', 'tail', 'barrel', 'mat', 'bucket', 'stripe', 'paw'] 2022-03-16 23:23:33,800.800 2829:trainer.py:487 do_train_dict(): eta: 12:55:31 iter: 39200 speed: 284.1 images/sec total_norm: 143.5918 (147.0592) loss: 141.8367 (142.4500) masked_loss: 1.5637 (1.5546) tag_loss: 139.8335 (140.8954) time: 1.4338 (1.8022) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.7970) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:23:34,161.161 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-16 23:23:34,162.162 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.35208129882812 2022-03-16 23:23:34,162.162 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.92469875927796 2022-03-16 23:23:53,953.953 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021037157624959946 2022-03-16 23:23:53,953.953 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-16 23:23:53,953.953 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', '[MASK]', 'to', 'serve', '[MASK]', 'tennis', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-16 23:23:53,969.969 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', '[UNK]', 'man', 'hand', 'head', 'tennis', 'nose', 'arm', 'hair', 'mouth', 'ear', 'face', 'banner', 'court', 'wall', 'logo', 'fence', 'handle', 'sign', 'eye', 'ball', 'short', 'player', 'stripe', 'cap', 'neck', 'person', 'hat', 'woman', 'necklace', 'uniform', 'net', 'ground', 'letter', 'glasses', 'jersey', 'leg', 'band', 'spectator', 'top', 'flag', 'collar', 'sunglasses', 'chair', 'sleeve', 'line', 'watch', 'grass', 'wrist', 'writing'] 2022-03-16 23:24:09,872.872 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'face', 'court', 'short', 'hair', 'mouth', 'person', 'wall', 'arm', 'boy', 'base', 'eye', 'chair', 'ball', 'shirt', 'nose', 'ear', 'tennis', 'uniform', 'banner', 'shoe', 'stripe', 'sock'] 2022-03-16 23:26:33,864.864 2829:trainer.py:487 do_train_dict(): eta: 12:52:46 iter: 39300 speed: 284.3 images/sec total_norm: 145.6378 (148.6445) loss: 145.5119 (143.9531) masked_loss: 1.5402 (1.5516) tag_loss: 143.6169 (142.4016) time: 1.4322 (1.8007) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.7956) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307 2022-03-16 23:26:34,225.225 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-16 23:26:34,225.225 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.55166625976562 2022-03-16 23:26:34,229.229 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 70.91679331009763
2022-03-16 23:26:54,038.038 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021045707166194916
2022-03-16 23:26:54,038.038 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:26:54,038.038 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'woman', 'is', 'standing', 'in', 'her', 'kitchen', '[MASK]', 'to', 'her', 'small', 'freeze', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:26:54,054.054 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'short', 'refrigerator', 'hand', 'shirt', 'woman', 'leg', '[UNK]', 'magnet', 'hair', 'foot', 'cord', 'face', 'head', 'outlet', 'flop', 'nose', 'door', 'shoe', 'eye', 'flip', 'kitchen', 'wire', 'switch', 'lady', 'paper', 'top', 'arm', 'smile', 'fridge', 'tile', 'can', 'stripe', 'ear', 'mouth', 'cabinet', 'handle', 'tank', 'glasses', 'lid', 'logo', 'next', 'box', 'ground', 'cup', 'neck', 'man', 'carpet', 'girl']
2022-03-16 23:27:09,865.865 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'small', 'line', 'next', 'door', 'woman', 'short', 'hair', 'floor', 'wall', 'arm', 'lady', 'foot', 'shirt', 'kitchen', 'leg', 'handle', 'glasses', 'dot', 'flip', 'cord', 'outlet', 'bucket', 'tile', 'jar', 'magnet', 'refrigerator', 'stripe', 'flop']
2022-03-16 23:29:33,874.874 2829:trainer.py:487 do_train_dict(): eta: 12:50:01 iter: 39400 speed: 284.4 images/sec total_norm: 145.0777 (148.0078) loss: 142.2869 (141.4427) masked_loss: 1.5194 (1.5303) tag_loss: 140.6424 (139.9123) time: 1.4325 (1.8001) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.7949) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307
2022-03-16 23:29:34,235.235 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043
2022-03-16 23:29:34,235.235 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.91006469726562
2022-03-16 23:29:34,236.236 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.92718570564367
2022-03-16 23:29:54,438.438 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02103949338197708
2022-03-16 23:29:54,438.438 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:29:54,439.439 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'black', 'motorcycle', 'parked', 'on', 'the', 'grass', 'next', 'to', 'some', 'si', '##los', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:29:54,454.454 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tree', 'road', 'motorcycle', 'grass', 'wall', 'tire', 'bike', 'building', 'bush', 'wheel', '[UNK]', 'light', 'mirror', 'tower', 'seat', 'fence', 'structure', 'crane', 'cloud', 'helmet', 'street', 'pole', 'side', 'field', 'car', 'bridge', 'pipe', 'water', 'sun', 'front', 'track', 'sign', 'dirt', 'next', 'black', 'windshield', 'window', 'city', 'shadow', 'bag', 'background', 'ground', 'large', 'couple', 'curb', 'post', 'sunset', 'leaf', 'exhaust']
2022-03-16 23:30:10,526.526 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'black', 'building', 'road', 'tree', 'tower', 'sky', 'shadow', 'wheel', 'grass', 'bush', 'bike', 'motorcycle', 'ladder', 'crane', 'windshield']
2022-03-16 23:32:34,144.144 2829:trainer.py:487 do_train_dict(): eta: 12:47:16 iter: 39500 speed: 284.0 images/sec total_norm: 145.1894 (146.7714) loss: 145.5321 (145.5164) masked_loss: 1.4324 (1.5180) tag_loss: 144.6010 (143.9984) time: 1.4325 (1.8027) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.7975) save_time: 8.8805 (18.1110) lr: 0.000041 max mem: 26307
2022-03-16 23:32:34,505.505 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6486486196517944
2022-03-16 23:32:34,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.7017364501953
2022-03-16 23:32:34,505.505 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.92869320301095
2022-03-16 23:32:54,638.638 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021076921373605728
2022-03-16 23:32:54,638.638 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:32:54,638.638 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'train', 'engine', 'with', 'train', '[MASK]', 'behind', 'it', '[MASK]', 'riding', 'on', 'a', 'set', 'of', '[MASK]', 'with', 'smoke', 'blowing', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:32:54,654.654 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'smoke', 'car', 'grass', 'window', 'track', 'steam', 'engine', 'wheel', 'gravel', 'tree', 'stream', 'water', 'rock', 'number', 'writing', 'wall', 'bush', 'roof', 'man', '[UNK]', 'red', 'door', 'hill', 'stripe', 'pole', 'top', 'sign', 'fence', 'black', 'person', 'building', 'tank', 'light', 'trunk', 'line', 'conductor', 'logo', 'road', 'toy', 'flower', 'bumper', 'model', 'ladder', 'hillside', 'plant', 'shirt', 'container', 'blue', 'passing']
2022-03-16 23:33:10,576.576 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'set', 'light', 'car', 'track', 'person', 'wall', 'engine', 'window', 'train', 'roof', 'wheel', 'stream', 'steam', 'grass', 'smoke', 'bush', 'logo', 'fence']
2022-03-16 23:35:34,253.253 2829:trainer.py:487 do_train_dict(): eta: 12:44:31 iter: 39600 speed: 284.3 images/sec total_norm: 144.0294 (146.7562) loss: 142.3820 (143.5742) masked_loss: 1.5020 (1.5617) tag_loss: 140.6226 (142.0125) time: 1.4325 (1.8011) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4274 (1.7960) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:35:34,614.614 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 23:35:34,614.614 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.71539306640625
2022-03-16 23:35:34,615.615 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
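[Editor's note] The "speed" field is consistent with batch size divided by average iteration time: the run directory name later in this log records batch-size 512, and at iter 39600 the averaged time is 1.8011 s. A quick check (assuming speed is computed exactly this way, which the log itself does not state):

```python
batch_size = 512        # from the output directory name in this log
avg_iter_time = 1.8011  # "time: 1.4325 (1.8011)" at iter 39600
print(f"{batch_size / avg_iter_time:.1f} images/sec")  # 284.3, matching the log
```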
= 70.93804578997326
2022-03-16 23:35:54,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02107016183435917
2022-03-16 23:35:54,809.809 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:35:54,809.809 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'banana', 'and', 'two', '[MASK]', 'fashioned', 'to', 'resemble', 'a', 'smiling', 'face', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:35:54,824.824 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['apple', 'stem', 'table', 'banana', 'shadow', 'fruit', 'spot', 'green', 'top', 'wooden', 'end', 'light', '[UNK]', 'face', 'reflection', 'ripe', 'white', 'orange', 'bunch', 'next', 'smiley', 'red', 'tomato', 'surface', 'close', 'line', 'small', 'counter', 'bowl', 'board', 'black', 'design', 'brown', 'different', 'eye', 'handle', 'half', 'knot', 'full', 'other', 'paper', 'cut', 'sit', 'wood', 'plate', 'many', 'picture', 'group', 'large', 'single']
2022-03-16 23:36:10,774.774 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['face', 'top', 'table', 'shadow', 'smiling', 'fruit', 'apple', 'stem', 'banana']
2022-03-16 23:38:34,642.642 2829:trainer.py:487 do_train_dict(): eta: 12:41:47 iter: 39700 speed: 283.8 images/sec total_norm: 144.1105 (146.7088) loss: 141.6339 (144.2381) masked_loss: 1.5692 (1.5496) tag_loss: 139.7788 (142.6884) time: 1.4346 (1.8039) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.7988) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:38:35,002.002 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-16 23:38:35,002.002 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.22828674316406
2022-03-16 23:38:35,002.002 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.94082688566428
2022-03-16 23:38:55,178.178 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021076759323477745
2022-03-16 23:38:55,178.178 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:38:55,179.179 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'on', 'top', 'of', 'a', 'beach', 'under', 'a', '[MASK]', 'sky', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:38:55,194.194 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'beach', 'man', 'sand', 'ocean', 'water', 'wave', 'cloud', 'person', '[UNK]', 'head', 'shirt', 'jacket', 'horizon', 'leg', 'short', 'shore', 'footprint', 'couple', 'jean', 'board', 'coat', 'bag', 'string', 'arm', 'foot', 'sandy', 'dog', 'tail', 'hair', 'hat', 'cloudy', 'track', 'hill', 'day', 'surf', 'surfer', 'shoe', 'rock', 'boy', 'mountain', 'backpack', 'wind', 'para', 'ground', 'hand', 'group', 'child', 'stick']
2022-03-16 23:39:11,149.149 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'water', 'top', 'short', 'person', 'beach', 'sky', 'shirt', 'ocean', 'leg', 'wave', 'sand', 'cloud', 'jacket', 'horizon', 'glove', 'kite', 'cloudy']
2022-03-16 23:41:34,749.749 2829:trainer.py:487 do_train_dict(): eta: 12:39:02 iter: 39800 speed: 284.3 images/sec total_norm: 144.4473 (147.2415) loss: 139.6186 (141.8172) masked_loss: 1.4489 (1.4903) tag_loss: 138.0980 (140.3269) time: 1.4327 (1.8011) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.7959) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:41:35,114.114 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-16 23:41:35,114.114 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.9981689453125
2022-03-16 23:41:35,115.115 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.94300183556732
2022-03-16 23:41:55,185.185 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02114068530499935
2022-03-16 23:41:55,185.185 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:41:55,186.186 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'with', 'a', 'plate', 'of', 'food', 'that', 'includes', 'soup', 'and', 'chewing', 'sandwich', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:41:55,201.201 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'hand', 'shirt', '[UNK]', 'sandwich', 'soup', 'plate', 'bread', 'table', 'spoon', 'man', 'person', 'cup', 'food', 'tomato', 'nose', 'salad', 'restaurant', 'face', 'hair', 'eye', 'jacket', 'head', 'chair', 'glasses', 'wall', 'glass', 'fork', 'woman', 'straw', 'napkin', 'ear', 'handle', 'finger', 'container', 'watch', 'arm', 'basket', 'mouth', 'picture', 'sunglasses', 'logo', 'phone', 'sauce', 'background', 'pot', 'sweater', 'large', 'design', 'ring']
2022-03-16 23:42:11,040.040 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'woman', 'cup', 'hair', 'mouth', 'person', 'table', 'wall', 'food', 'eye', 'chair', 'plant', 'shirt', 'picture', 'nose', 'bowl', 'restaurant', 'plate', 'jacket', 'bread', 'soup', 'sandwich', 'candle', 'lemon', 'spoon', 'tomato']
03-16 23:43:04.589 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-16 23:43:04.589 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-16 23:43:05.750 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 88}]
2022-03-16 23:44:35,197.197 2829:trainer.py:487 do_train_dict(): eta: 12:36:17 iter: 39900 speed: 283.7 images/sec total_norm: 144.5613 (147.8947) loss: 144.5917 (144.3493) masked_loss: 1.4687 (1.5257) tag_loss: 143.1203 (142.8236) time: 1.4343 (1.8045) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.7990) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:44:35,557.557 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 23:44:35,558.558 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.4141387939453
2022-03-16 23:44:35,558.558 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
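[Editor's note] The interleaved aml_server.py entries show a monitor thread shelling out to nvidia-smi and reporting one {'mem_used', 'mem_total', 'gpu_util'} dict per GPU. aml_server.py's actual parsing code is not visible in this log; the sketch below uses nvidia-smi's query mode (a real CLI interface) to produce output of the same shape:

```python
import subprocess

def gpu_monitor():
    """Return one dict per GPU in the shape logged by monitor().
    Sketch only: not aml_server.py's actual implementation."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    gpus = []
    for line in out.strip().splitlines():
        # each line looks like "29024, 32510, 100"
        mem_used, mem_total, util = (int(x) for x in line.split(", "))
        gpus.append({"mem_used": mem_used,
                     "mem_total": mem_total,
                     "gpu_util": util})
    return gpus
```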
= 70.94843773841858
2022-03-16 23:44:55,715.715 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0211332980543375
2022-03-16 23:44:55,716.716 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:44:55,716.716 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'a', '[MASK]', 'board', 'with', 'people', 'walking', 'behind', 'him', 'near', 'a', 'building', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:44:55,731.731 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hood', 'man', 'jacket', 'glove', 'building', '[UNK]', 'face', 'person', 'nose', 'hand', 'head', 'letter', 'mouth', 'sky', 'eye', 'hat', 'roof', 'shoe', 'jean', 'sign', 'sidewalk', 'ground', 'coat', 'wall', 'boot', 'board', 'stripe', 'backpack', 'window', 'mustache', 'word', 'bag', 'helmet', 'floor', 'finger', 'street', 'woman', 'tree', 'door', 'boy', 'billboard', 'leg', 'brick', 'patch', 'pole', 'bicycle', 'photo', 'pocket', 'picture', 'front']
2022-03-16 23:45:11,656.656 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'building', 'ground', 'board', 'person', 'eye', 'letter', 'sign', 'sky', 'roof', 'nose', 'bag', 'coat', 'hat', 'jacket', 'hood', 'ski', 'helmet', 'shoe', 'sidewalk', 'tire', 'glove', 'hose', 'mustache']
2022-03-16 23:47:35,975.975 2829:trainer.py:487 do_train_dict(): eta: 12:33:32 iter: 40000 speed: 283.2 images/sec total_norm: 146.1526 (149.3473) loss: 140.3300 (143.2892) masked_loss: 1.5450 (1.5433) tag_loss: 139.1518 (141.7459) time: 1.4338 (1.8077) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4285 (1.8026) save_time: 8.8805 (18.1110) lr: 0.000040 max mem: 26307
2022-03-16 23:47:35,977.977 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0040000.pt
2022-03-16 23:47:45,490.490 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-16 23:47:45,490.490 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.34422302246094
2022-03-16 23:47:45,491.491 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
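[Editor's note] The checkpoint.py entry at iter 40000 writes snapshot/model_iter_0040000.pt under the run's output directory, i.e. the iteration number zero-padded to seven digits. A hedged one-liner reproducing that naming (the function name and the placeholder directory are hypothetical; checkpoint.py's own save code is not shown):

```python
import os

def snapshot_path(output_dir, iteration):
    # "model_iter_0040000.pt": seven-digit zero padding, as in the log
    return os.path.join(output_dir, "snapshot", f"model_iter_{iteration:07d}.pt")

print(snapshot_path("output/my_run", 40000))
# output/my_run/snapshot/model_iter_0040000.pt
```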
= 70.95160658876794
2022-03-16 23:48:05,848.848 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021165965124964714
2022-03-16 23:48:05,849.849 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:48:05,849.849 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sign', 'that', 'has', 'some', 'ice', 'hanging', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:48:05,864.864 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'sign', 'sky', 'snow', 'wall', 'roof', 'pole', 'ice', '[UNK]', 'post', 'fence', 'brick', 'door', 'street', 'snowy', 'tree', 'side', 'letter', 'top', 'stop', 'design', 'line', 'cloud', 'corner', 'chimney', 'tall', 'person', 'image', 'ledge', 'antenna', 'graffiti', 'front', 'large', 'city', 'paint', 'frame', 'next', 'white', 'board', 'blue', 'wood', 'wire', 'covered', 'light', 'bunch', 'structure', 'arrow', 'water', 'couple']
2022-03-16 23:48:21,665.665 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'wall', 'ice', 'window', 'sign', 'sky', 'roof', 'snow']
2022-03-16 23:50:44,614.614 2829:trainer.py:487 do_train_dict(): eta: 12:30:52 iter: 40100 speed: 271.4 images/sec total_norm: 143.2763 (146.1242) loss: 145.2726 (144.4033) masked_loss: 1.4612 (1.4940) tag_loss: 143.9406 (142.9093) time: 1.4333 (1.8864) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.7898) save_time: 8.8805 (16.9902) lr: 0.000040 max mem: 26307
2022-03-16 23:50:44,976.976 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-16 23:50:44,976.976 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 105.00901794433594
2022-03-16 23:50:44,976.976 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.96004895072672
2022-03-16 23:51:05,456.456 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021176688373088837
2022-03-16 23:51:05,456.456 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:51:05,457.457 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'different', '[MASK]', 'of', 'animals', 'grazing', '[MASK]', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:51:05,472.472 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hill', 'zebra', 'field', 'sky', 'head', 'bush', 'grass', 'animal', 'tree', 'tail', 'herd', 'cow', '[UNK]', 'horn', 'mane', 'leg', 'buffalo', 'group', 'stripe', 'hillside', 'shadow', 'plain', 'ear', 'grassy', 'cloud', 'horse', 'wild', 'open', 'bird', 'horizon', 'bunch', 'other', 'many', 'green', 'nose', 'dry', 'large', 'goat', 'number', 'tall', 'top', 'day', 'grazing', 'elephant', 'couple', 'savannah', 'sunny', 'next', 'face', 'few']
2022-03-16 23:51:21,350.350 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'different', 'field', 'hill', 'tree', 'sky', 'animal', 'grass', 'tail', 'bush', 'plain', 'herd', 'mane', 'zebra']
2022-03-16 23:53:45,064.064 2829:trainer.py:487 do_train_dict(): eta: 12:28:07 iter: 40200 speed: 283.7 images/sec total_norm: 143.9652 (148.0750) loss: 140.2942 (140.5746) masked_loss: 1.5093 (1.5210) tag_loss: 138.3478 (139.0536) time: 1.4329 (1.8046) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.7994) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-16 23:53:45,427.427 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-16 23:53:45,428.428 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.53134155273438
2022-03-16 23:53:45,428.428 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
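[Editor's note] The "Tag mAP" figure is a mean average precision over the tag vocabulary, scoring predicted tags against the GT Tags sets printed alongside each sample. The exact computation in tagger_caption_uni_pipeline_expanding.py is not shown in this log; a common way to compute a multi-label mAP of this kind (sketch only, using scikit-learn) is:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(y_true, y_score):
    """Mean AP over tag classes with at least one positive example.
    y_true:  (num_samples, num_tags) binary ground-truth matrix.
    y_score: (num_samples, num_tags) predicted tag scores.
    Sketch only; not the pipeline's own mAP code."""
    aps = [average_precision_score(y_true[:, c], y_score[:, c])
           for c in range(y_true.shape[1]) if y_true[:, c].any()]
    return float(np.mean(aps))
```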
= 70.97026062958294
2022-03-16 23:54:05,807.807 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021170616149902344
2022-03-16 23:54:05,808.808 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:54:05,808.808 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'riding', 'a', 'skate', '##board', '[MASK]', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:54:05,823.823 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'line', 'street', 'building', '[UNK]', 'sidewalk', 'ground', 'pole', 'curb', 'sign', 'sky', 'tree', 'wall', 'window', 'shoe', 'step', 'man', 'shirt', 'shadow', 'light', 'car', 'head', 'door', 'bush', 'wheel', 'leg', 'stair', 'person', 'hand', 'fence', 'jean', 'tire', 'house', 'post', 'fire', 'hair', 'roof', 'railing', 'grass', 'letter', 'truck', 'face', 'traffic', 'hat', 'boy', 'short', 'city', 'arm', 'wire', 'stop']
2022-03-16 23:54:21,791.791 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'building', 'door', 'road', 'ground', 'wall', 'arm', 'boy', 'window', 'step', 'letter', 'sign', 'jean', 'shirt', 'leg', 'wheel', 'hat', 'cap', 'reflection', 'shoe', 'sidewalk', 'curb']
2022-03-16 23:56:45,551.551 2829:trainer.py:487 do_train_dict(): eta: 12:25:22 iter: 40300 speed: 283.7 images/sec total_norm: 144.4516 (146.2000) loss: 139.0191 (140.5680) masked_loss: 1.4820 (1.5087) tag_loss: 137.3365 (139.0593) time: 1.4330 (1.8048) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.7998) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-16 23:56:45,912.912 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184
2022-03-16 23:56:45,913.913 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.34767150878906
2022-03-16 23:56:45,913.913 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.97349332110717
2022-03-16 23:57:06,432.432 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02123328112065792
2022-03-16 23:57:06,432.432 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-16 23:57:06,433.433 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bear', 'rolling', 'on', 'his', 'back', 'on', 'some', 'logs', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-16 23:57:06,449.449 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'rock', 'log', 'head', 'ear', 'grass', 'nose', 'bush', 'branch', 'ground', 'tree', 'leaf', 'plant', 'animal', 'trunk', 'leg', 'paw', 'snout', 'eye', 'fur', 'face', 'mouth', 'black', 'wood', 'tail', 'dirt', 'cat', 'wall', 'large', 'brown', 'back', 'cub', 'field', 'stick', 'forest', 'zoo', 'enclosure', 'claw', 'weed', 'neck', '[UNK]', 'knot', 'fence', 'stone', 'tongue', 'pole', 'flower', 'water', 'dog', 'hole']
2022-03-16 23:57:22,334.334 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'ground', 'rock', 'wall', 'plant', 'tree', 'branch', 'animal', 'leg', 'nose', 'ear', 'bear', 'cat', 'rolling', 'grass', 'bush', 'fur', 'leaf', 'trunk', 'log', 'zoo', 'claw', 'paw']
2022-03-16 23:59:46,400.400 2829:trainer.py:487 do_train_dict(): eta: 12:22:37 iter: 40400 speed: 283.1 images/sec total_norm: 146.2662 (148.4430) loss: 146.1528 (145.9936) masked_loss: 1.4482 (1.4997) tag_loss: 144.6334 (144.4938) time: 1.4334 (1.8085) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.8033) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-16 23:59:46,762.762 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-16 23:59:46,762.762 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.17861938476562
2022-03-16 23:59:46,762.762 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.97270175321603
2022-03-17 00:00:07,373.373 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021215716376900673
2022-03-17 00:00:07,373.373 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:00:07,373.373 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'girl', 'in', 'a', 'pink', 'and', '[MASK]', 'dress', '[MASK]', 'her', 'arm', '[MASK]', 'and', 'a', 'kite', 'flies', 'in', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:00:07,389.389 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'kite', 'water', 'dress', 'arm', 'grass', 'flower', 'girl', 'hair', 'boat', 'hand', 'child', 'building', 'tail', 'string', 'dirt', 'ground', 'tree', 'hill', 'city', '[UNK]', 'house', 'sand', 'beach', 'shadow', 'handle', 'sidewalk', 'head', 'little', 'young', 'leg', 'path', 'person', 'woman', 'bracelet', 'ribbon', 'bow', 'skirt', 'watch', 'shore', 'baby', 'elbow', 'bush', 'lake', 'body', 'ball', 'weed', 'plant', 'wrist', 'field']
2022-03-17 00:00:23,395.395 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'house', 'hand', 'little', 'water', 'building', 'hair', 'girl', 'blue', 'green', 'child', 'arm', 'tree', 'beach', 'sky', 'boat', 'dress', 'handle', 'pink', 'string', 'sand', 'grass', 'tail', 'bush', 'flower', 'dirt', 'skirt', 'sidewalk', 'kite']
2022-03-17 00:02:47,006.006 2829:trainer.py:487 do_train_dict(): eta: 12:19:51 iter: 40500 speed: 283.5 images/sec total_norm: 147.2135 (150.0210) loss: 144.4520 (145.5894) masked_loss: 1.4516 (1.4931) tag_loss: 143.0449 (144.0964) time: 1.4330 (1.8061) data: 0.0001 (0.0001) to_device: 0.0051 (0.0049) time_gpu: 1.4277 (1.8010) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:02:47,366.366 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-17 00:02:47,367.367 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.04547119140625
2022-03-17 00:02:47,367.367 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
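[Editor's note] Across this stretch of the log the printed learning rate steps down slowly: 0.000041 near iter 39.3k, 0.000040 around iter 39.6k-40.9k, 0.000039 near 40.2k, down to 0.000037 by iter 41.6k. That trajectory is consistent with a linear decay from the lr_1e-4 base rate named in the run directory. The schedule below is a hypothetical fit, not something the log states; max_iter is inferred from the printed values and could easily be off:

```python
def linear_lr(iteration, base_lr=1e-4, max_iter=66000):
    # hypothetical linear decay to zero; 1e-4 * (1 - 39600/66000) = 4.0e-5
    # matches the "lr: 0.000040" printed at iter 39600
    return base_lr * max(0.0, 1.0 - iteration / max_iter)

print(f"{linear_lr(39600):.6f}")  # 0.000040
```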
= 70.96762747365266
2022-03-17 00:03:07,886.886 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021214559674263
2022-03-17 00:03:07,886.886 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:03:07,886.886 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'table', 'with', 'some', 'bananas', '[MASK]', 'pick', '[MASK]', 'on', 'it', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:03:07,902.902 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', 'table', '[UNK]', 'letter', 'jar', 'bag', 'bottle', 'sign', 'label', 'cloth', 'shirt', 'man', 'person', 'can', 'bunch', 'lid', 'board', 'wall', 'store', 'hand', 'shoe', 'display', 'wheel', 'bananas', 'woman', 'logo', 'cap', 'word', 'towel', 'rack', 'jacket', 'fruit', 'shelf', 'head', 'top', 'box', 'hair', 'backpack', 'market', 'bowl', 'banner', 'picture', 'hat', 'yellow', 'stem', 'poster', 'apple', 'arm', 'bucket', 'spice']
2022-03-17 00:03:23,794.794 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'board', 'design', 'person', 'table', 'wall', 'label', 'picture', 'camera', 'handle', 'wheel', 'jacket', 'logo', 'cloth', 'bunch', 'sleeve', 'lid', 'candle', 'banana', 'jar']
2022-03-17 00:05:47,720.720 2829:trainer.py:487 do_train_dict(): eta: 12:17:06 iter: 40600 speed: 283.3 images/sec total_norm: 143.6671 (145.1691) loss: 144.4194 (143.9174) masked_loss: 1.5077 (1.5380) tag_loss: 143.0626 (142.3795) time: 1.4331 (1.8072) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4279 (1.8020) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:05:48,081.081 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-17 00:05:48,081.081 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.3411865234375
2022-03-17 00:05:48,082.082 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.96891763110712
2022-03-17 00:06:08,528.528 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021212834864854813
2022-03-17 00:06:08,528.528 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:06:08,529.529 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'road', 'is', 'closed', 'off', 'via', 'signage', '[MASK]', 'cones', 'for', 'extra', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:06:08,544.544 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['letter', 'cone', 'sign', 'pole', 'sky', 'street', 'ground', 'car', 'truck', 'sidewalk', 'tree', '[UNK]', 'shadow', 'building', 'word', 'road', 'roof', 'orange', 'light', 'person', 'van', 'parking', 'window', 'man', 'traffic', 'can', 'construction', 'stop', 'background', 'lot', 'base', 'fence', 'flag', 'snow', 'house', 'stripe', 'tire', 'top', 'line', 'suv', 'trash', 'billboard', 'barrel', 'bridge', 'mountain', 'shirt', 'curb', 'post', 'wall', 'bag']
2022-03-17 00:06:24,437.437 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'house', 'line', 'building', 'road', 'street', 'car', 'ground', 'board', 'post', 'wall', 'lot', 'cover', 'window', 'tree', 'letter', 'sign', 'sky', 'protection', 'background', 'roof', 'extra', 'truck', 'parking', 'pole', 'trash', 'tire', 'cone', 'curb', 'chimney', 'weed', 'signage']
2022-03-17 00:08:48,697.697 2829:trainer.py:487 do_train_dict(): eta: 12:14:21 iter: 40700 speed: 282.9 images/sec total_norm: 145.4901 (148.0564) loss: 140.8905 (141.4956) masked_loss: 1.4829 (1.5416) tag_loss: 139.2438 (139.9540) time: 1.4329 (1.8097) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4276 (1.8045) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:08:49,057.057 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436
2022-03-17 00:08:49,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.97797393798828
2022-03-17 00:08:49,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.97352829166488
2022-03-17 00:09:09,789.789 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021244116127490997
2022-03-17 00:09:09,789.789 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:09:09,790.790 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'on', 'the', 'pitching', 'mound', 'in', 'a', '"', 'after', 'pitching', '"', 'position', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:09:09,805.805 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'leg', 'dirt', 'shirt', 'uniform', 'head', 'hand', 'grass', 'man', 'letter', 'jersey', 'shoe', 'glove', 'baseball', 'field', 'mound', 'logo', 'arm', 'cap', 'face', 'player', 'hat', 'shadow', 'mouth', 'ear', 'ball', 'ground', 'hair', 'nose', 'name', 'number', 'pitcher', 'stripe', 'sleeve', 'sock', 'helmet', 'patch', 'pitch', 'foot', 'line', 'belt', 'eye', 'person', 'wall', 'game', 'beard', 'home', 'neck', 'finger', 'professional']
2022-03-17 00:09:25,666.666 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'field', 'position', 'mouth', 'baseball', 'letter', 'shirt', 'jersey', 'leg', 'ear', 'shadow', 'grass', 'hat', 'uniform', 'dirt', 'pitcher', 'logo', 'beard', 'shoe', 'mound', 'pitching', 'glove']
2022-03-17 00:11:49,624.624 2829:trainer.py:487 do_train_dict(): eta: 12:11:36 iter: 40800 speed: 283.0 images/sec total_norm: 145.7211 (150.5294) loss: 142.6655 (143.6273) masked_loss: 1.4637 (1.4953) tag_loss: 141.0935 (142.1320) time: 1.4335 (1.8093) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8041) save_time: 8.8805 (16.9902) lr: 0.000039 max mem: 26307
2022-03-17 00:11:49,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663
2022-03-17 00:11:49,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 121.43850708007812
2022-03-17 00:11:49,993.993 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.98243673273286
2022-03-17 00:12:13,069.069 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02125268056988716
2022-03-17 00:12:13,069.069 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:12:13,070.070 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'laptop', 'sitting', 'on', 'a', '##unda', 'with', 'books', 'beside', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:12:13,085.085 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['laptop', 'keyboard', 'table', 'screen', 'floor', '[UNK]', 'leg', 'key', 'computer', 'book', 'chair', 'wall', 'cord', 'desk', 'paper', 'mouse', 'room', 'box', 'hat', 'pad', 'open', 'notebook', 'top', 'window', 'wire', 'coffee', 'logo', 'ipod', 'door', 'shelf', 'light', 'rug', 'napkin', 'cup', 'wooden', 'pen', 'cd', 'plate', 'person', 'next', 'stool', 'handle', 'phone', 'pillow', 'bowl', 'cable', 'tray', 'small', 'button', 'glass']
2022-03-17 00:12:29,168.168 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'book', 'floor', 'table', 'wall', 'key', 'chair', 'computer', 'box', 'sitting', 'screen', 'leg', 'desk', 'keyboard', 'cord', 'laptop']
03-17 00:13:05.805 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 00:13:05.805 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 00:13:06.929 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}]
2022-03-17 00:14:52,030.030 2829:trainer.py:487 do_train_dict(): eta: 12:08:51 iter: 40900 speed: 280.7 images/sec total_norm: 147.4822 (149.9739) loss: 141.3504 (142.6642) masked_loss: 1.5605 (1.5514) tag_loss: 139.9812 (141.1129) time: 1.4326 (1.8240) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4273 (1.8190) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:14:52,391.391 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183
2022-03-17 00:14:52,392.392 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.48965454101562
2022-03-17 00:14:52,392.392 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.98645333080756
2022-03-17 00:15:13,240.240 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02129349298775196
2022-03-17 00:15:13,240.240 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:15:13,241.241 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', '[MASK]', 'asian', 'people', 'eating', 'dinner', '[MASK]', 'a', 'restaurant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:15:13,256.256 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'wall', 'table', 'glass', 'face', 'woman', 'restaurant', 'man', 'chair', 'head', 'hand', 'person', 'straw', 'nose', 'plate', 'glasses', 'eye', 'fork', '[UNK]', 'napkin', 'mouth', 'cup', 'brick', 'arm', 'booth', 'candle', 'seat', 'boy', 'eyebrow', 'salt', 'spoon', 'food', 'ear', 'wine', 'knife', 'drink', 'window', 'pizza', 'phone', 'juice', 'girl', 'bowl', 'menu', 'cake', 'water', 'couch', 'picture', 'logo', 'sign']
2022-03-17 00:15:29,191.191 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'woman', 'hair', 'girl', 'person', 'table', 'wall', 'food', 'seat', 'boy', 'glass', 'couple', 'eye', 'chair', 'shirt', 'nose', 'wine', 'dinner', 'restaurant', 'plate', 'brick', 'knife', 'glasses', 'logo', 'booth', 'fork', 'cake', 'sauce', 'necklace', 'straw', 'candle', 'dessert', 'napkin']
2022-03-17 00:17:52,707.707 2829:trainer.py:487 do_train_dict(): eta: 12:06:05 iter: 41000 speed: 283.4 images/sec total_norm: 145.9117 (148.7334) loss: 141.9776 (144.4098) masked_loss: 1.5249 (1.5139) tag_loss: 140.9367 (142.8960) time: 1.4325 (1.8067) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.8012) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:17:53,069.069 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417
2022-03-17 00:17:53,069.069 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.8832550048828
2022-03-17 00:17:53,069.069 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.98577220944593
2022-03-17 00:18:13,771.771 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021356800571084023
2022-03-17 00:18:13,771.771 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:18:13,772.772 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'computer', 'screen', 'with', 'a', 'melting', 'apple', 'on', 'it', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:18:13,787.787 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['computer', 'monitor', 'desk', 'screen', 'wall', 'keyboard', 'table', 'mouse', 'light', 'laptop', 'lamp', '[UNK]', 'stand', 'speaker', 'cord', 'pad', 'base', 'logo', 'wire', 'curtain', 'picture', 'box', 'television', 'room', 'book', 'cup', 'icon', 'front', 'window', 'desktop', 'shelf', 'man', 'top', 'paper', 'phone', 'mug', 'green', 'image', 'hair', 'shade', 'handle', 'clock', 'next', 'glass', 'cat', 'head', 'cell', 'bottle', 'apple', 'shadow']
2022-03-17 00:18:29,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['light', 'table', 'wall', 'base', 'stand', 'computer', 'screen', 'desk', 'speaker', 'apple', 'mouse', 'monitor', 'shade', 'keyboard', 'lamp', 'cord', 'laptop', 'icon']
2022-03-17 00:20:53,635.635 2829:trainer.py:487 do_train_dict(): eta: 12:03:20 iter: 41100 speed: 283.0 images/sec total_norm: 148.2693 (150.8415) loss: 137.9602 (139.7773) masked_loss: 1.4969 (1.5241) tag_loss: 136.3296 (138.2532) time: 1.4321 (1.8094) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.8043) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:20:53,996.996 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805
2022-03-17 00:20:53,997.997 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 113.35835266113281
2022-03-17 00:20:53,997.997 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.99389804914159
2022-03-17 00:21:14,729.729 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02137037180364132
2022-03-17 00:21:14,730.730 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:21:14,730.730 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'she', 'is', 'talking', 'on', 'her', 'phone', '[MASK]', 'of', 'the', 'restaurant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:21:14,746.746 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'window', 'sunglasses', 'woman', 'building', 'door', 'face', 'jean', 'sign', '[UNK]', 'sweater', 'shirt', 'phone', 'store', 'head', 'sidewalk', 'bench', 'hand', 'wall', 'car', 'girl', 'plant', 'cell', 'necklace', 'light', 'tree', 'pole', 'letter', 'banner', 'shadow', 'handle', 'fire', 'arm', 'person', 'chair', 'reflection', 'front', 'pot', 'restaurant', 'ground', 'jacket', 'nose', 'paint', 'glass', 'man', 'umbrella', 'bag', 'street', 'lady', 'glasses']
2022-03-17 00:21:30,687.687 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'building', 'door', 'woman', 'car', 'hair', 'wall', 'phone', 'plant', 'window', 'watch', 'cell', 'sign', 'jean', 'shirt', 'nose', 'restaurant', 'shadow', 'ceiling', 'reflection', 'banner', 'decoration', 'sidewalk', 'necklace', 'sweater', 'sunglasses']
2022-03-17 00:23:54,581.581 2829:trainer.py:487 do_train_dict(): eta: 12:00:34 iter: 41200 speed: 283.0 images/sec total_norm: 144.8334 (148.6476) loss: 138.5683 (139.7458) masked_loss: 1.3650 (1.4131) tag_loss: 137.3355 (138.3327) time: 1.4319 (1.8094) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.8042) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:23:54,942.942 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452
2022-03-17 00:23:54,943.943 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.1354522705078
2022-03-17 00:23:54,943.943 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
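[Editor's note] The "caption acc" values are consistent with the fraction of [MASK]ed caption positions predicted correctly, accumulated in float32: 0.6470588445663452 is exactly 22/34 in float32, and 0.529411792755127 earlier is 18/34. The pipeline's own accuracy code is not shown; a minimal PyTorch sketch of that metric:

```python
import torch

def masked_caption_acc(logits, target, mask_positions):
    """Fraction of [MASK] positions whose argmax prediction matches the
    target token. Sketch only; not the pipeline's actual code.
    logits:         (batch, seq_len, vocab_size) decoder outputs
    target:         (batch, seq_len) ground-truth token ids
    mask_positions: bool tensor (batch, seq_len), True at [MASK]"""
    pred = logits.argmax(dim=-1)
    correct = (pred[mask_positions] == target[mask_positions]).float()
    return correct.mean()  # e.g. 22/34 -> 0.6470588445663452 in float32
```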
= 70.98711486705568
2022-03-17 00:24:15,777.777 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021388206630945206
2022-03-17 00:24:15,777.777 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:24:15,778.778 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'refrigerator', 'and', 'counter', 'in', 'a', 'small', 'absorption', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:24:15,793.793 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'cabinet', '[UNK]', 'shelf', 'ceiling', 'wall', 'kitchen', 'floor', 'door', 'handle', 'refrigerator', 'outlet', 'sink', 'drawer', 'light', 'paper', 'switch', 'stove', 'top', 'tile', 'frame', 'sign', 'room', 'oven', 'wood', 'wooden', 'vent', 'empty', 'glass', 'tree', 'box', 'rack', 'table', 'counter', 'towel', 'bottle', 'large', 'island', 'bar', 'cord', 'mirror', 'fridge', 'board', 'chair', 'view', 'hood', 'hand', 'label', 'cart', 'reflection']
2022-03-17 00:24:31,648.648 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'room', 'top', 'light', 'floor', 'wall', 'paper', 'window', 'metal', 'kitchen', 'counter', 'frame', 'handle', 'cabinet', 'ceiling', 'sink', 'shelf', 'drawer', 'outlet', 'tile', 'refrigerator']
2022-03-17 00:26:55,796.796 2829:trainer.py:487 do_train_dict(): eta: 11:57:49 iter: 41300 speed: 282.5 images/sec total_norm: 144.9906 (148.4171) loss: 141.1563 (141.4852) masked_loss: 1.4459 (1.5125) tag_loss: 139.8734 (139.9727) time: 1.4329 (1.8122) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8070) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:26:56,155.155 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625
2022-03-17 00:26:56,156.156 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.70407104492188
2022-03-17 00:26:56,156.156 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.99148092408112
2022-03-17 00:27:17,017.017 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021377334371209145
2022-03-17 00:27:17,017.017 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:27:17,018.018 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'wild', 'animals', 'walking', '[MASK]', '[MASK]', 'the', 'day', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:27:17,033.033 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'bird', 'ground', 'elephant', 'field', 'bush', 'grass', 'leg', 'trunk', 'water', 'animal', 'ear', 'tail', 'head', '[UNK]', 'shadow', 'dirt', 'person', 'group', 'log', 'branch', 'duck', 'herd', 'hill', 'large', 'stick', 'wing', 'sheep', 'rock', 'pole', 'flock', 'small', 'grassy', 'bank', 'dog', 'body', 'river', 'couple', 'dry', 'next', 'open', 'white', 'area', 'fence', 'cow', 'man', 'wild', 'car', 'many', 'mound']
2022-03-17 00:27:33,086.086 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['day', 'water', 'field', 'ground', 'person', 'couple', 'structure', 'tree', 'wild', 'bird', 'grass', 'tail', 'bush', 'dirt', 'shelter', 'elephant', 'mound']
2022-03-17 00:29:56,878.878 2829:trainer.py:487 do_train_dict(): eta: 11:55:03 iter: 41400 speed: 282.7 images/sec total_norm: 146.6220 (149.9863) loss: 138.0694 (139.8062) masked_loss: 1.4989 (1.4981) tag_loss: 136.2042 (138.3082) time: 1.4334 (1.8108) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4282 (1.8057) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:29:57,238.238 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816
2022-03-17 00:29:57,239.239 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.69540405273438
2022-03-17 00:29:57,239.239 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 70.99186012543828
2022-03-17 00:30:18,249.249 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021445048972964287
2022-03-17 00:30:18,249.249 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 00:30:18,250.250 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'sheep', ',', 'one', 'looking', 'at', 'the', 'camera', ',', 'while', '[MASK]', 'other', 'looks', 'away', 'are', 'in', 'the', 'wilderness', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 00:30:18,265.265 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'face', 'leg', 'sheep', 'tree', 'horn', 'fence', 'grass', 'wire', 'ground', 'trunk', 'post', 'ear', 'nose', 'pole', 'field', 'goat', 'bush', 'animal', 'ram', 'rock', 'leaf', 'branch', 'plant', 'mouth', 'standing', 'hill', 'wood', 'green', 'path', 'flower', 'white', 'dirt', 'forest', 'foot', 'fern', '[UNK]', 'log', 'eye', 'stick', 'couple', 'area', 'black', 'tail', 'grassy', 'bird', 'dog', 'top', 'sky', 'hillside']
2022-03-17 00:30:34,174.174 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'head', 'face', 'field', 'ground', 'post', 'plant', 'tree', 'leg', 'ear', 'bird', 'camera', 'grass', 'pole', 'horn', 'wire', 'sheep', 'fence', 'wilderness', 'goat']
2022-03-17 00:32:57,996.996 2829:trainer.py:487 do_train_dict(): eta: 11:52:17 iter: 41500 speed: 282.7 images/sec total_norm: 146.9815 (150.6781) loss: 139.4754 (141.0921) masked_loss: 1.4573 (1.4615) tag_loss: 137.8322 (139.6306) time: 1.4341 (1.8112) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4287 (1.8060) save_time: 8.8805 (16.9902) lr: 0.000038 max mem: 26307
2022-03-17 00:32:58,358.358 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452
2022-03-17 00:32:58,359.359 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.09123229980469
2022-03-17 00:32:58,359.359 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.00295742658469 2022-03-17 00:33:19,472.472 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021421164274215698 2022-03-17 00:33:19,472.472 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:33:19,473.473 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'young', 'skate', '##board', '##ers', 'near', 'a', 'puddle', '[MASK]', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:33:19,488.488 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'building', 'man', '[UNK]', 'person', 'window', 'sky', 'jean', 'boy', 'hat', 'ground', 'sidewalk', 'shoe', 'hair', 'head', 'sign', 'cap', 'reflection', 'street', 'railing', 'city', 'light', 'group', 'hand', 'arm', 'ladder', 'floor', 'pole', 'rail', 'banner', 'fence', 'balcony', 'wheel', 'trick', 'glove', 'young', 'road', 'board', 'billboard', 'water', 'curb', 'car', 'walkway', 'bench', 'woman', 'air', 'skate', 'crane', 'bottle', 'number'] 2022-03-17 00:33:35,363.363 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'building', 'road', 'young', 'ground', 'hair', 'person', 'boy', 'window', 'sign', 'sky', 'jean', 'shirt', 'rail', 'wheel', 'hat', 'cap', 'reflection', 'shoe', 'sidewalk', 'railing', 'puddle'] 2022-03-17 00:35:59,084.084 2829:trainer.py:487 do_train_dict(): eta: 11:49:31 iter: 41600 speed: 282.7 images/sec total_norm: 146.7032 (149.6729) loss: 143.8828 (142.5958) masked_loss: 1.4399 (1.4936) tag_loss: 142.1304 (141.1022) time: 1.4328 (1.8109) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.8058) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:35:59,445.445 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 00:35:59,445.445 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.02047729492188 2022-03-17 00:35:59,445.445 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.00820345615597 2022-03-17 00:36:20,612.612 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021423395723104477 2022-03-17 00:36:20,612.612 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:36:20,613.613 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', '[MASK]', 'swinging', 'pose', 'with', 'a', 'tennis', 'ra', '##c', '##quet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:36:20,628.628 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'sock', 'shoe', 'short', '[UNK]', 'man', 'tennis', 'court', 'shirt', 'hand', 'arm', 'head', 'wall', 'ball', 'shadow', 'hair', 'player', 'line', 'letter', 'ground', 'logo', 'handle', 'face', 'knee', 'band', 'ear', 'hat', 'male', 'person', 'sign', 'nose', 'stripe', 'writing', 'cap', 'string', 'match', 'mouth', 'beard', 'wrist', 'uniform', 'stand', 'eye', 'white', 'sleeve', 'serve', 'ready', 'air', 'game', 'chair', 'banner'] 2022-03-17 00:36:36,580.580 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'court', 'short', 'ground', 'hair', 'person', 'wall', 'arm', 'chair', 'letter', 'shirt', 'leg', 'tennis', 'shadow', 'jacket', 'bench', 'logo', 'shoe', 'swinging', 'pose', 'stripe', 'sock'] 2022-03-17 00:39:00,138.138 2829:trainer.py:487 do_train_dict(): eta: 11:46:46 iter: 41700 speed: 282.8 images/sec total_norm: 144.4076 (145.8507) loss: 139.5413 (141.3665) masked_loss: 1.5289 (1.5295) tag_loss: 138.0480 (139.8371) time: 1.4334 (1.8105) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.8053) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:39:00,499.499 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 00:39:00,499.499 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.20304870605469 2022-03-17 00:39:00,499.499 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.01553769088818 2022-03-17 00:39:21,750.750 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021411824971437454 2022-03-17 00:39:21,750.750 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:39:21,750.750 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'a', 'tennis', 'rack', '##et', 'is', 'standing', 'on', '[MASK]', 'court', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:39:21,765.765 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', '[UNK]', 'line', 'court', 'sock', 'tennis', 'shirt', 'hand', 'ground', 'short', 'woman', 'head', 'leg', 'hair', 'arm', 'handle', 'man', 'logo', 'person', 'girl', 'ball', 'fence', 'wall', 'uniform', 'player', 'net', 'face', 'hat', 'ear', 'tree', 'pole', 'letter', 'sign', 'skirt', 'ponytail', 'cap', 'chair', 'boy', 'jersey', 'stripe', 'top', 'watch', 'bracelet', 'band', 'bag', 'car', 'banner', 'wrist', 'shadow', 'glasses'] 2022-03-17 00:39:37,667.667 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'player', 'court', 'short', 'ground', 'hair', 'wall', 'arm', 'shirt', 'leg', 'background', 'nose', 'ear', 'tennis', 'bottle', 'hat', 'cap', 'logo', 'fence', 'shoe', 'sunglasses', 'stripe', 'sock'] 2022-03-17 00:42:01,417.417 2829:trainer.py:487 do_train_dict(): eta: 11:44:00 iter: 41800 speed: 282.4 images/sec total_norm: 145.6436 (148.5254) loss: 140.2720 (142.2225) masked_loss: 1.4315 (1.4636) tag_loss: 138.9981 (140.7589) time: 1.4333 (1.8128) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8077) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:42:01,777.777 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 00:42:01,777.777 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.86753845214844 2022-03-17 00:42:01,778.778 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
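
Between successive reports the eta field shrinks by roughly 100 times the averaged per-iteration time, so it is presumably the running average iteration time multiplied by the iterations left. A minimal sketch under that assumption; max_iter never appears in the log, so the value used below is purely illustrative:

```python
import datetime

def format_eta(avg_iter_seconds, current_iter, max_iter):
    """ETA as the average iteration time times the iterations remaining.

    Assumption: this matches trainer.py's `eta:` field; max_iter is not
    printed anywhere in this log, so 65000 here is a made-up example.
    """
    remaining = int(avg_iter_seconds * (max_iter - current_iter))
    return str(datetime.timedelta(seconds=remaining))

print(format_eta(1.8108, 41400, 65000))  # -> 11:52:14
```
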
= 71.02708155556908 2022-03-17 00:42:23,030.030 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02141238935291767 2022-03-17 00:42:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:42:23,031.031 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'bed', 'with', 'a', 'attached', 'tables', '[MASK]', '[MASK]', 'lights', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:42:23,047.047 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bed', 'floor', 'room', 'lamp', 'pillow', 'book', '[UNK]', 'table', 'mattress', 'light', 'sheet', 'nightstand', 'blanket', 'window', 'bedroom', 'chair', 'rug', 'vase', 'carpet', 'cushion', 'drawer', 'cup', 'shelf', 'door', 'blind', 'clock', 'tile', 'shade', 'tray', 'leg', 'reflection', 'box', 'cord', 'desk', 'white', 'phone', 'large', 'shadow', 'alarm', 'frame', 'seat', 'flower', 'cabinet', 'paper', 'television', 'remote', 'speaker', 'bottom', 'picture'] 2022-03-17 00:42:39,076.076 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'large', 'book', 'door', 'light', 'cup', 'floor', 'bed', 'wall', 'glass', 'attached', 'sheet', 'blanket', 'item', 'pillow', 'carpet', 'lamp', 'shelf', 'mattress', 'rug', 'cushion'] 03-17 00:43:07.021 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 00:43:07.021 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 00:43:08.347 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 00:45:02,627.627 2829:trainer.py:487 do_train_dict(): eta: 11:41:14 iter: 41900 speed: 282.5 images/sec total_norm: 146.8112 (148.5743) loss: 145.0779 (144.7114) masked_loss: 1.5479 (1.5577) tag_loss: 143.8079 (143.1536) time: 1.4322 (1.8121) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4271 (1.8070) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:45:02,989.989 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 00:45:02,990.990 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.1563491821289 2022-03-17 00:45:02,990.990 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.04008935292562 2022-03-17 00:45:24,171.171 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021418746560811996 2022-03-17 00:45:24,171.171 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:45:24,172.172 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'of', 'different', 'soccer', 'players', 'are', 'competing', '[MASK]', 'the', 'field', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:45:24,187.187 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['short', 'hair', 'shirt', 'ball', 'sock', 'shoe', 'man', 'soccer', 'uniform', 'grass', 'hand', 'arm', 'tree', 'field', 'bag', 'head', 'backpack', 'jersey', 'logo', 'boy', 'leg', 'face', 'number', 'player', 'ground', 'line', 'game', 'pole', 'camera', '[UNK]', 'person', 'knee', 'background', 'young', 'stripe', 'group', 'fence', 'couple', 'blue', 'bottle', 'other', 'mouth', 'watch', 'back', 'white', 'trash', 'air', 'ear', 'bracelet', 'glove'] 2022-03-17 00:45:40,127.127 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'head', 'man', 'hand', 'number', 'different', 'player', 'short', 'field', 'ground', 'hair', 'arm', 'boy', 'couple', 'tree', 'ball', 'shirt', 'jersey', 'leg', 'background', 'bag', 'soccer', 'grass', 'uniform', 'logo', 'shoe', 'backpack', 'sock'] 2022-03-17 00:48:03,916.916 2829:trainer.py:487 do_train_dict(): eta: 11:38:28 iter: 42000 speed: 282.4 images/sec total_norm: 145.9572 (148.6094) loss: 143.3673 (143.4991) masked_loss: 1.4955 (1.4897) tag_loss: 141.8718 (142.0094) time: 1.4315 (1.8129) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.8077) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:48:04,278.278 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 00:48:04,278.278 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.319091796875 2022-03-17 00:48:04,278.278 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.04135333897099 2022-03-17 00:48:25,488.488 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02141851745545864 2022-03-17 00:48:25,489.489 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:48:25,489.489 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tooth', 'brush', 'and', 'tube', 'of', 'tooth', 'paste', 'on', 'glass', 'surface', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:48:25,505.505 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'shelf', '[UNK]', 'handle', 'table', 'brush', 'glass', 'knife', 'object', 'tile', 'blade', 'cabinet', 'top', 'bottle', 'mirror', 'scissors', 'line', 'base', 'frame', 'ledge', 'white', 'door', 'board', 'microwave', 'kitchen', 'window', 'sink', 'counter', 'head', 'spoon', 'plate', 'water', 'shadow', 'container', 'panel', 'light', 'dish', 'leaf', 'button', 'small', 'reflection', 'vase', 'bar', 'screw', 'cup', 'tooth', 'drawer', 'clock', 'plant', 'hole'] 2022-03-17 00:48:41,389.389 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'door', 'table', 'wall', 'glass', 'surface', 'label', 'shadow', 'tube', 'brush', 'shelf', 'screw', 'tooth', 'paste'] 2022-03-17 00:51:05,172.172 2829:trainer.py:487 do_train_dict(): eta: 11:35:42 iter: 42100 speed: 282.5 images/sec total_norm: 147.9737 (150.6611) loss: 140.9359 (141.5875) masked_loss: 1.4245 (1.4539) tag_loss: 139.3352 (140.1336) time: 1.4322 (1.8126) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.8072) save_time: 8.8805 (16.9902) lr: 0.000037 max mem: 26307 2022-03-17 00:51:05,533.533 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 00:51:05,534.534 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 181.57708740234375 2022-03-17 00:51:05,534.534 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
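
The caption acc values are small-denominator fractions (0.6666666865348816 is float32 2/3; 0.7058823704719543 is float32 12/17), which suggests accuracy is computed over the handful of supervised caption-token positions of a single sample rather than over the whole batch. A hedged sketch of that computation, borrowing the standard masked-LM ignore_index convention as an assumption:

```python
import torch

def masked_caption_accuracy(logits, labels, ignore_index=-100):
    """Accuracy over only the supervised caption-token positions.

    `labels` is assumed to hold target token ids at scored positions and
    ignore_index everywhere else (the usual masked-LM convention); the log
    alone does not reveal exactly which positions are scored.
    """
    scored = labels != ignore_index                 # positions that count
    preds = logits.argmax(dim=-1)                   # (batch, seq_len)
    correct = (preds[scored] == labels[scored]).sum()
    return (correct.float() / scored.sum().clamp(min=1)).item()

# 12 of 17 scored tokens correct reproduces the value printed in the log:
# torch.tensor(12 / 17, dtype=torch.float32).item() -> 0.7058823704719543
```
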
= 71.03069684177778 2022-03-17 00:51:27,005.005 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021461354568600655 2022-03-17 00:51:27,005.005 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:51:27,005.005 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'airport', 'with', 'a', 'large', 'white', 'passenger', 'jet', 'sitting', '[MASK]', 'a', 'tar', '##mac', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:51:27,021.021 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'airport', 'airplane', 'floor', 'engine', 'wing', 'tail', 'building', '[UNK]', 'carpet', 'pole', 'ground', 'cloud', 'wall', 'truck', 'vehicle', 'person', 'runway', 'cart', 'wheel', 'man', 'car', 'terminal', 'cone', 'large', 'body', 'front', 'windshield', 'cockpit', 'chair', 'light', 'door', 'luggage', 'van', 'frame', 'plane', 'stair', 'bus', 'logo', 'gate', 'box', 'sign', 'nose', 'leg', 'seat', 'line', 'shadow', 'city', 'passenger'] 2022-03-17 00:51:42,965.965 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'body', 'building', 'large', 'white', 'football', 'car', 'ground', 'person', 'floor', 'wall', 'engine', 'airport', 'window', 'wing', 'sky', 'vehicle', 'passenger', 'truck', 'shadow', 'wheel', 'terminal', 'tail', 'cloud', 'pole', 'jet', 'runway', 'carpet', 'balcony', 'airplane', 'windshield'] 2022-03-17 00:54:06,555.555 2829:trainer.py:487 do_train_dict(): eta: 11:32:56 iter: 42200 speed: 282.3 images/sec total_norm: 148.0372 (150.4387) loss: 142.9486 (143.4243) masked_loss: 1.5630 (1.5307) tag_loss: 141.3828 (141.8936) time: 1.4329 (1.8138) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8086) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 00:54:06,917.917 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 00:54:06,917.917 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.8490753173828 2022-03-17 00:54:06,917.917 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.03660656543488 2022-03-17 00:54:28,390.390 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021487630903720856 2022-03-17 00:54:28,391.391 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:54:28,391.391 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'is', 'displayed', 'in', 'a', 'house', 'with', 'wooden', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:54:28,407.407 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cabinet', 'kitchen', 'microwave', '[UNK]', 'handle', 'door', 'wall', 'outlet', 'oven', 'drawer', 'maker', 'coffee', 'refrigerator', 'cord', 'sink', 'bottle', 'stove', 'towel', 'window', 'ceiling', 'light', 'floor', 'pot', 'paper', 'control', 'kettle', 'counter', 'panel', 'container', 'clock', 'plug', 'jar', 'top', 'display', 'steel', 'knob', 'cup', 'picture', 'glass', 'box', 'knife', 'block', 'bowl', 'book', 'stainless', 'lid', 'magnet', 'telephone', 'can', 'white'] 2022-03-17 00:54:44,364.364 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'door', 'board', 'table', 'wall', 'window', 'kitchen', 'coffee', 'wooden', 'handle', 'cabinet', 'bottle', 'pan', 'sink', 'soap', 'glasses', 'maker', 'drawer', 'outlet', 'jar', 'stove', 'oven', 'microwave'] 2022-03-17 00:57:08,188.188 2829:trainer.py:487 do_train_dict(): eta: 11:30:09 iter: 42300 speed: 281.9 images/sec total_norm: 148.8757 (150.3574) loss: 138.9896 (139.6231) masked_loss: 1.5204 (1.5630) tag_loss: 137.2247 (138.0601) time: 1.4339 (1.8163) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4287 (1.8111) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 00:57:08,548.548 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 00:57:08,549.549 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.0687255859375 2022-03-17 00:57:08,549.549 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.04761534816814 2022-03-17 00:57:29,971.971 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021544501185417175 2022-03-17 00:57:29,972.972 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 00:57:29,972.972 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'mountain', 'biker', 'pumps', 'his', '[MASK]', 'in', 'celebration', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 00:57:29,987.987 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ground', 'bush', 'tree', 'dirt', 'rock', '[UNK]', 'sky', 'head', 'leg', 'hill', 'branch', 'shirt', 'road', 'shoe', 'stick', 'tail', 'wheel', 'grass', 'arm', 'hat', 'ear', 'brush', 'pole', 'bottle', 'man', 'truck', 'tire', 'hand', 'person', 'bike', 'bag', 'top', 'mountain', 'cloud', 'bench', 'motorcycle', 'face', 'handle', 'woman', 'jean', 'wood', 'jacket', 'trunk', 'short', 'bicycle', 'hair', 'field', 'cap', 'backpack', 'horse'] 2022-03-17 00:57:45,997.997 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'road', 'short', 'ground', 'arm', 'hill', 'mountain', 'tree', 'sky', 'shirt', 'leg', 'wheel', 'bush', 'dirt', 'fist', 'celebration', 'bike', 'bicycle', 'helmet', 'shoe', 'glove', 'sunglasses', 'sock', 'biker'] 2022-03-17 01:00:09,714.714 2829:trainer.py:487 do_train_dict(): eta: 11:27:23 iter: 42400 speed: 282.1 images/sec total_norm: 144.9691 (149.5484) loss: 141.5725 (143.4463) masked_loss: 1.4272 (1.4615) tag_loss: 139.9841 (141.9847) time: 1.4331 (1.8153) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.8101) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:00:10,075.075 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-17 01:00:10,075.075 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.24624633789062 2022-03-17 01:00:10,076.076 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.0422955052993 2022-03-17 01:00:31,773.773 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02159140259027481 2022-03-17 01:00:31,773.773 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:00:31,774.774 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'yellow', 'flower', 'emerges', 'from', 'a', 'blue', 'vase', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:00:31,789.789 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['vase', 'tile', 'wall', 'flower', '[UNK]', 'sink', 'stem', 'leaf', 'bathroom', 'water', 'blue', 'bottle', 'glass', 'mirror', 'table', 'line', 'counter', 'reflection', 'base', 'clear', 'handle', 'top', 'light', 'container', 'shelf', 'cabinet', 'kitchen', 'bottom', 'door', 'background', 'rack', 'window', 'floor', 'picture', 'white', 'cap', 'ring', 'soap', 'shadow', 'hand', 'paper', 'next', 'towel', 'tiled', 'holder', 'plant', 'shirt', 'knob', 'ledge', 'man'] 2022-03-17 01:00:47,721.721 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'top', 'blue', 'wall', 'base', 'yellow', 'bathroom', 'flower', 'leaf', 'sole', 'stem', 'reflection', 'container', 'tile', 'ledge', 'vase'] 2022-03-17 01:03:11,397.397 2829:trainer.py:487 do_train_dict(): eta: 11:24:37 iter: 42500 speed: 281.8 images/sec total_norm: 146.7993 (149.5911) loss: 143.1370 (143.0844) masked_loss: 1.4714 (1.5115) tag_loss: 141.5977 (141.5729) time: 1.4322 (1.8169) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8117) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:03:11,758.758 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 01:03:11,758.758 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.534423828125 2022-03-17 01:03:11,758.758 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
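
Tag mAP hovers around 0.0215 while the Tag Precision running value sits near 71, so the two are evidently computed on different scales (mAP over the full ranked tag list, precision presumably at a fixed cutoff or threshold, reported in percent). One common way to compute a per-class mean average precision, offered only as an assumption about what tagger_caption_uni_pipeline_expanding.py does:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_mean_ap(scores, gt):
    """Per-class AP averaged over classes with at least one positive.

    scores: (N, C) predicted tag scores; gt: (N, C) binary ground truth.
    A sketch of one standard multi-label mAP, not a transcription of the
    pipeline's actual metric code.
    """
    aps = [average_precision_score(gt[:, c], scores[:, c])
           for c in range(gt.shape[1]) if gt[:, c].any()]
    return float(np.mean(aps)) if aps else 0.0
```
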
= 71.05517115167609 2022-03-17 01:03:33,164.164 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021591635420918465 2022-03-17 01:03:33,164.164 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:03:33,164.164 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', 'in', 'an', 'ocean', 'that', 'is', 'falling', 'off', 'his', 'surf', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:03:33,180.180 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wave', 'water', '[UNK]', 'ocean', 'head', 'arm', 'man', 'hand', 'surfer', 'hair', 'sky', 'leg', 'person', 'foot', 'board', 'short', 'shirt', 'foam', 'suit', 'wet', 'top', 'logo', 'surf', 'shore', 'mountain', 'beach', 'face', 'reflection', 'design', 'white', 'ripple', 'horizon', 'watch', 'woman', 'small', 'back', 'large', 'rock', 'body', 'blue', 'wake', 'boat', 'name', 'sea', 'spray', 'trunk', 'big', 'crest', 'cloud', 'fin'] 2022-03-17 01:03:49,040.040 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'water', 'board', 'hair', 'person', 'ocean', 'wave', 'logo', 'ripple', 'surfer'] 2022-03-17 01:06:12,894.894 2829:trainer.py:487 do_train_dict(): eta: 11:21:51 iter: 42600 speed: 282.1 images/sec total_norm: 146.3640 (148.7817) loss: 141.2117 (141.8549) masked_loss: 1.4163 (1.4640) tag_loss: 139.8284 (140.3910) time: 1.4330 (1.8150) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4277 (1.8097) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:06:13,258.258 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 01:06:13,258.258 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 125.75096893310547 2022-03-17 01:06:13,258.258 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.06623224798913 2022-03-17 01:06:34,853.853 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021602030843496323 2022-03-17 01:06:34,853.853 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:06:34,854.854 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'people', 'are', 'watching', 'a', '[MASK]', 'doing', 'his', 'thing', 'on', 'a', 'skate', '##board', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:06:34,869.869 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wheel', 'shirt', 'hand', 'ramp', 'person', 'park', 'arm', 'boy', 'man', 'shoe', 'short', 'skate', 'bowl', 'sock', 'pole', 'helmet', 'head', 'sky', 'knee', 'hat', 'tree', 'pad', 'building', 'fence', 'logo', 'background', 'skater', 'car', 'leg', 'sign', 'trick', 'board', 'pool', 'railing', 'light', 'umbrella', 'cap', 'tent', 'can', 'ground', 'rim', 'beach', 'bicycle', 'table', 'truck', 'shadow', 'hair', 'child', 'cloud'] 2022-03-17 01:06:50,780.780 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'park', 'woman', 'short', 'thing', 'person', 'child', 'table', 'arm', 'boy', 'guy', 'tree', 'shirt', 'background', 'bowl', 'camera', 'wheel', 'hat', 'knee', 'pole', 'tent', 'helmet', 'shoe', 'pad', 'umbrella', 'ramp', 'railing', 'skate', 'sock'] 2022-03-17 01:09:14,586.586 2829:trainer.py:487 do_train_dict(): eta: 11:19:05 iter: 42700 speed: 281.8 images/sec total_norm: 144.8645 (148.6564) loss: 141.0954 (143.3357) masked_loss: 1.4752 (1.4932) tag_loss: 139.7748 (141.8425) time: 1.4333 (1.8169) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4281 (1.8116) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:09:14,947.947 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 01:09:14,947.947 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.65634155273438 2022-03-17 01:09:14,947.947 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.07369912227738 2022-03-17 01:09:36,365.365 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02159293182194233 2022-03-17 01:09:36,365.365 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:09:36,366.366 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'sandwich', 'is', 'seen', '[MASK]', 'a', 'paper', 'plate', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:09:36,381.381 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sandwich', 'bread', 'table', 'plate', 'cheese', 'meat', 'food', '[UNK]', 'handle', 'crust', 'bacon', 'onion', 'bowl', 'fork', 'cup', 'napkin', 'egg', 'shadow', 'container', 'glass', 'mug', 'cut', 'white', 'coffee', 'knife', 'spoon', 'paper', 'half', 'top', 'sauce', 'tomato', 'toast', 'jar', 'hole', 'logo', 'french', 'bottle', 'chip', 'liquid', 'ham', 'rim', 'piece', 'close', 'blade', 'dish', 'leaf', 'butter', 'lid', 'bottom', 'cloth'] 2022-03-17 01:09:52,326.326 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['table', 'food', 'key', 'paper', 'plate', 'meat', 'button', 'wire', 'bread', 'mouse', 'keyboard', 'cord', 'sandwich', 'crust', 'napkin'] 2022-03-17 01:12:16,282.282 2829:trainer.py:487 do_train_dict(): eta: 11:16:18 iter: 42800 speed: 281.8 images/sec total_norm: 146.3325 (148.7189) loss: 144.5788 (144.6105) masked_loss: 1.4489 (1.5052) tag_loss: 143.1053 (143.1054) time: 1.4324 (1.8170) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4272 (1.8117) save_time: 8.8805 (16.9902) lr: 0.000036 max mem: 26307 2022-03-17 01:12:16,641.641 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 01:12:16,641.641 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.23428344726562 2022-03-17 01:12:16,641.641 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.07512057466663 2022-03-17 01:12:38,319.319 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02159733511507511 2022-03-17 01:12:38,319.319 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:12:38,320.320 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'and', '[MASK]', '[MASK]', 'looking', 'at', 'a', 'police', 'motorcycle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:12:38,335.335 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['motorcycle', 'shirt', '[UNK]', 'light', 'tire', 'bike', 'hair', 'head', 'shoe', 'ground', 'boy', 'man', 'fender', 'hand', 'tank', 'short', 'shadow', 'tree', 'wheel', 'sidewalk', 'road', 'person', 'street', 'boot', 'arm', 'bag', 'jean', 'helmet', 'child', 'pole', 'building', 'sky', 'woman', 'seat', 'engine', 'pipe', 'car', 'leg', 'sunglasses', 'line', 'bush', 'fence', 'curb', 'wall', 'grass', 'mirror', 'windshield', 'gas', 'sign', 'window'] 2022-03-17 01:12:54,300.300 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'little', 'face', 'father', 'building', 'road', 'light', 'short', 'ground', 'hair', 'police', 'word', 'child', 'boy', 'engine', 'window', 'tree', 'letter', 'sky', 'shirt', 'ear', 'shadow', 'flag', 'wheel', 'mirror', 'bench', 'horn', 'bike', 'fence', 'motorcycle', 'boot', 'skirt', 'shoe', 'tire', 'sunglasses', 'fender', 'windshield'] 03-17 01:13:08.389 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 01:13:08.389 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 01:13:09.534 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 01:15:17,978.978 2829:trainer.py:487 do_train_dict(): eta: 11:13:32 iter: 42900 speed: 281.8 images/sec total_norm: 145.3660 (148.1841) loss: 141.8306 (142.2851) masked_loss: 1.5162 (1.5272) tag_loss: 140.3550 (140.7579) time: 1.4316 (1.8170) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4264 (1.8118) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:15:18,339.339 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 01:15:18,339.339 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.22744750976562 2022-03-17 01:15:18,339.339 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.06134655087493 2022-03-17 01:15:40,066.066 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02160204015672207 2022-03-17 01:15:40,066.066 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:15:40,067.067 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'that', 'has', '[MASK]', 'pink', 'cups', 'on', 'the', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:15:40,083.083 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'stove', 'kitchen', 'cup', 'oven', 'knob', 'handle', 'door', 'cabinet', 'bottle', 'mug', 'top', 'shelf', 'jar', 'floor', 'coffee', 'lid', 'tile', 'window', 'pot', 'container', 'drawer', 'rack', 'spoon', 'sink', 'outlet', 'bowl', 'counter', 'bucket', 'pole', 'table', 'cord', 'box', 'refrigerator', 'kettle', 'glass', 'microwave', 'pitcher', 'white', 'wire', 'dish', 'plate', 'fan', 'can', 'fire', 'towel', 'small', 'cap', 'display'] 2022-03-17 01:15:55,979.979 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'door', 'cup', 'floor', 'wall', 'kitchen', 'counter', 'handle', 'cabinet', 'fan', 'bottle', 'liquid', 'lid', 'bucket', 'rack', 'mug', 'stove', 'coaster', 'knob', 'oven', 'refrigerator', 'kettle'] 2022-03-17 01:18:19,724.724 2829:trainer.py:487 do_train_dict(): eta: 11:10:45 iter: 43000 speed: 281.7 images/sec total_norm: 145.9306 (147.8647) loss: 136.6767 (137.7407) masked_loss: 1.4589 (1.4774) tag_loss: 134.9663 (136.2633) time: 1.4332 (1.8174) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.8123) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:18:20,084.084 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 01:18:20,084.084 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.02027893066406 2022-03-17 01:18:20,084.084 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.06931020820777 2022-03-17 01:18:41,731.731 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021593688055872917 2022-03-17 01:18:41,731.731 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:18:41,731.731 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'kid', 'and', 'a', 'person', 'with', '[MASK]', 'tooth', '##brush', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:18:41,747.747 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['mouth', 'eye', 'nose', 'hair', 'face', 'hand', 'girl', 'wall', 'child', '[UNK]', 'finger', 'ear', 'towel', 'forehead', 'person', 'teeth', 'arm', 'shirt', 'nail', 'head', 'cheek', 'sleeve', 'little', 'young', 'eyebrow', 'tongue', 'bathroom', 'handle', 'ring', 'neck', 'tile', 'brush', 'thumb', 'door', 'woman', 'boy', 'bang', 'chest', 'curtain', 'toilet', 'floor', 'blanket', 'knob', 'small', 'baby', 'kid', 'chin', 'toy', 'holder', 'tub'] 2022-03-17 01:18:57,654.654 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'small', 'hair', 'girl', 'mouth', 'person', 'child', 'wall', 'eye', 'ring', 'shirt', 'finger', 'nose', 'ear', 'kid', 'handle', 'button', 'holder', 'sleeve', 'cuff'] 2022-03-17 01:21:21,610.610 2829:trainer.py:487 do_train_dict(): eta: 11:07:59 iter: 43100 speed: 281.5 images/sec total_norm: 147.0580 (148.7670) loss: 141.0400 (141.6718) masked_loss: 1.4622 (1.4708) tag_loss: 140.0831 (140.2011) time: 1.4334 (1.8189) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.8137) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:21:21,970.970 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 01:21:21,971.971 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.8335723876953 2022-03-17 01:21:21,971.971 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
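
The speed field stays near 282 images/sec while the windowed iteration time stays near 1.43 s, which is consistent with a global batch of roughly 405 images spread across the 8 V100s. The batch size itself is never printed, so the following is a consistency check only, with the batch size as an explicit assumption:

```python
def images_per_sec(global_batch_size, iter_seconds):
    """Throughput as printed in the `speed: 282.7 images/sec` field."""
    return global_batch_size / iter_seconds

# ~405 images per global batch over the windowed 1.433 s/iter matches the
# logged speed; 405 is inferred, not logged.
print(images_per_sec(405, 1.4330))  # -> ~282.6
```
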
= 71.06686391653838 2022-03-17 01:21:43,713.713 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021623732522130013 2022-03-17 01:21:43,713.713 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:21:43,714.714 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'counter', 'and', 'sink', '[MASK]', 'dishes', 'on', 'them', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:21:43,729.729 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kitchen', '[UNK]', 'cabinet', 'wall', 'window', 'handle', 'bottle', 'sink', 'stove', 'bowl', 'door', 'chair', 'oven', 'knob', 'lid', 'cup', 'floor', 'knife', 'drawer', 'pot', 'towel', 'dish', 'rack', 'board', 'tile', 'box', 'block', 'top', 'cutting', 'shelf', 'basket', 'spoon', 'sponge', 'pipe', 'plate', 'container', 'mug', 'bag', 'picture', 'paper', 'washing', 'white', 'refrigerator', 'hood', 'plant', 'jar', 'pan', 'pitcher', 'soap', 'counter'] 2022-03-17 01:21:59,641.641 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'cup', 'design', 'floor', 'wall', 'chair', 'window', 'kitchen', 'picture', 'bowl', 'counter', 'handle', 'plate', 'cabinet', 'knife', 'bottle', 'sink', 'pipe', 'shade', 'pot', 'holder', 'dish', 'towel', 'basket', 'lid', 'drawer', 'rack', 'spoon', 'stove', 'knob', 'oven', 'rug'] 2022-03-17 01:24:23,682.682 2829:trainer.py:487 do_train_dict(): eta: 11:05:12 iter: 43200 speed: 281.2 images/sec total_norm: 145.5534 (147.8206) loss: 139.4266 (140.4954) masked_loss: 1.5146 (1.5144) tag_loss: 138.0130 (138.9810) time: 1.4345 (1.8208) data: 0.0001 (0.0005) to_device: 0.0052 (0.0050) time_gpu: 1.4292 (1.8153) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:24:24,044.044 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-17 01:24:24,044.044 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.73623657226562 2022-03-17 01:24:24,044.044 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.0748316300108 2022-03-17 01:24:46,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021662456914782524 2022-03-17 01:24:46,007.007 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:24:46,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'den', 'with', 'a', 'table', ',', 'couch', '[MASK]', 'television', 'and', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:24:46,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'table', 'room', 'floor', 'television', 'glass', 'picture', 'cabinet', 'couch', 'chair', 'ceiling', 'bowl', 'living', 'door', 'coffee', 'pillow', 'shelf', 'lamp', 'center', 'entertainment', 'plant', 'light', 'book', 'shade', 'window', 'speaker', 'stand', 'plate', 'sofa', 'drawer', 'rug', 'vase', '[UNK]', 'clock', 'reflection', 'frame', 'curtain', 'furniture', 'cushion', 'painting', 'flower', 'candle', 'screen', 'dresser', 'mirror', 'base', 'tray', 'outlet', 'large', 'wooden'] 2022-03-17 01:25:02,001.001 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'door', 'light', 'cup', 'television', 'floor', 'table', 'wall', 'glass', 'chair', 'plant', 'figure', 'window', 'picture', 'coffee', 'leg', 'bowl', 'clock', 'cabinet', 'speaker', 'ceiling', 'couch', 'flower', 'remote', 'sculpture', 'switch', 'den', 'pot', 'pillow', 'curtain', 'shelf', 'drawer', 'cushion'] 2022-03-17 01:27:25,522.522 2829:trainer.py:487 do_train_dict(): eta: 11:02:26 iter: 43300 speed: 281.6 images/sec total_norm: 145.5002 (149.2901) loss: 139.2665 (141.3544) masked_loss: 1.5310 (1.5388) tag_loss: 137.6039 (139.8156) time: 1.4333 (1.8183) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4280 (1.8131) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:27:25,882.882 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 01:27:25,882.882 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.45013427734375 2022-03-17 01:27:25,882.882 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.07677402584234 2022-03-17 01:27:47,679.679 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021678118035197258 2022-03-17 01:27:47,679.679 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:27:47,680.680 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'sun', 'sets', '[MASK]', 'a', 'vacant', 'sienna', 'area', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:27:47,695.695 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'pole', 'light', 'building', 'tree', 'street', 'sun', 'traffic', 'sign', 'fence', 'sunset', 'roof', 'city', 'car', 'person', '[UNK]', 'road', 'bird', 'window', 'tower', 'dusk', 'line', 'van', 'sidewalk', 'night', 'ground', 'top', 'trunk', 'truck', 'parking', 'branch', 'pillar', 'chimney', 'bench', 'wire', 'lot', 'telephone', 'man', 'arrow', 'intersection', 'bicycle', 'horizon', 'lamp', 'bus', 'box', 'hill', 'stop', 'post', 'fire', 'statue'] 2022-03-17 01:28:03,646.646 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['house', 'area', 'building', 'street', 'light', 'ground', 'arm', 'sun', 'window', 'tree', 'sign', 'sky', 'industrial', 'traffic', 'roof', 'truck', 'grass', 'pole', 'fence', 'vacant', 'balcony'] 2022-03-17 01:30:27,494.494 2829:trainer.py:487 do_train_dict(): eta: 10:59:39 iter: 43400 speed: 281.4 images/sec total_norm: 146.2397 (150.1466) loss: 142.5784 (143.7262) masked_loss: 1.5533 (1.5260) tag_loss: 140.9010 (142.2003) time: 1.4336 (1.8197) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4285 (1.8146) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:30:27,855.855 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-17 01:30:27,855.855 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.47756958007812 2022-03-17 01:30:27,856.856 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.08488894933942 2022-03-17 01:30:49,771.771 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02168019860982895 2022-03-17 01:30:49,772.772 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:30:49,772.772 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'elephant', 'walking', 'around', 'in', 'the', '[MASK]', 'on', '[MASK]', 'sunny', 'day', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:30:49,788.788 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'leg', 'sky', 'elephant', 'tail', 'grass', 'ear', 'ground', 'rock', 'water', 'forest', 'trunk', 'head', 'back', 'shadow', 'dirt', 'log', 'background', 'fence', 'pole', '[UNK]', 'zoo', 'sand', 'foot', 'field', 'fountain', 'pipe', 'roof', 'pool', 'eye', 'enclosure', 'bush', 'tank', 'boulder', 'cloud', 'barrel', 'large', 'road', 'puddle', 'structure', 'building', 'stick', 'area', 'person', 'animal', 'body', 'wall', 'bird', 'waterfall', 'mountain'] 2022-03-17 01:31:05,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'day', 'water', 'ground', 'rock', 'forest', 'tree', 'box', 'sky', 'walking', 'leg', 'background', 'ear', 'shadow', 'grass', 'tail', 'bush', 'stick', 'pole', 'dirt', 'fence', 'zoo', 'elephant', 'sunny'] 2022-03-17 01:33:29,513.513 2829:trainer.py:487 do_train_dict(): eta: 10:56:52 iter: 43500 speed: 281.3 images/sec total_norm: 146.1836 (147.8293) loss: 143.3249 (142.5361) masked_loss: 1.5377 (1.5199) tag_loss: 142.1102 (141.0163) time: 1.4332 (1.8202) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8151) save_time: 8.8805 (16.9902) lr: 0.000035 max mem: 26307 2022-03-17 01:33:29,873.873 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.42424243688583374 2022-03-17 01:33:29,873.873 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.48931884765625 2022-03-17 01:33:29,873.873 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.0874801688238 2022-03-17 01:33:51,953.953 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021685268729925156 2022-03-17 01:33:51,953.953 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:33:51,953.953 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'don', '[MASK]', 'are', 'inside', 'of', 'a', 'foil', '##ed', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:33:51,969.969 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['foil', '[UNK]', 'sandwich', 'reflection', 'paper', 'table', 'person', 'pastry', 'light', 'hand', 'leaf', 'bread', 'aluminum', 'food', 'tin', 'hot', 'bun', 'stem', 'hole', 'next', 'close', 'white', 'onion', 'top', 'half', 'tomato', 'finger', 'green', 'other', 'plate', 'plastic', 'arm', 'head', 'cheese', 'dog', 'end', 'glass', 'cup', 'sleeve', 'small', 'wall', 'large', 'logo', 'hamburger', 'couple', 'ice', 'shoe', 'meat', 'piece', 'ready'] 2022-03-17 01:34:07,929.929 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'paper', 'reflection', 'sandwich', 'container', 'foil', 'tomato', 'pastry'] 2022-03-17 01:36:31,863.863 2829:trainer.py:487 do_train_dict(): eta: 10:54:06 iter: 43600 speed: 280.8 images/sec total_norm: 146.7132 (149.1552) loss: 141.0429 (142.9773) masked_loss: 1.4744 (1.5146) tag_loss: 139.4549 (141.4628) time: 1.4339 (1.8236) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4285 (1.8184) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:36:32,222.222 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 01:36:32,223.223 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 105.09342193603516 2022-03-17 01:36:32,223.223 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.10114144296887 2022-03-17 01:36:54,277.277 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021676894277334213 2022-03-17 01:36:54,277.277 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:36:54,278.278 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'person', 'in', '[MASK]', 'boots', ',', 'on', 'ski', '##is', 'with', '[MASK]', 'ski', 'poles', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:36:54,293.293 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ski', 'boot', 'snow', 'pole', '[UNK]', 'ground', 'shadow', 'person', 'track', 'leg', 'strap', 'glove', 'skier', 'foot', 'handle', 'sky', 'jacket', 'hand', 'stripe', 'line', 'man', 'coat', 'slope', 'snowy', 'tree', 'pair', 'red', 'tag', 'poles', 'trail', 'face', 'orange', 'back', 'backpack', 'woman', 'hill', 'couple', 'country', 'top', 'hat', 'gear', 'other', 'shirt', 'design', 'side', 'footprint', 'equipment', 'way', 'scarf', 'sunglasses'] 2022-03-17 01:37:10,154.154 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'ground', 'track', 'person', 'leg', 'snow', 'shadow', 'pole', 'ski', 'boot', 'strap'] 2022-03-17 01:39:33,958.958 2829:trainer.py:487 do_train_dict(): eta: 10:51:19 iter: 43700 speed: 281.2 images/sec total_norm: 146.5332 (148.3827) loss: 144.1066 (143.8117) masked_loss: 1.4286 (1.4971) tag_loss: 142.9181 (142.3146) time: 1.4330 (1.8209) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8158) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:39:34,319.319 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 01:39:34,320.320 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.47254943847656 2022-03-17 01:39:34,320.320 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.10382006048611 2022-03-17 01:39:56,492.492 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021680761128664017 2022-03-17 01:39:56,492.492 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:39:56,493.493 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'girl', 'wearing', 'a', 'helmet', 'and', 'holding', 'a', '[MASK]', 'bat', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:39:56,508.508 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'bat', 'person', 'grass', 'hand', 'eye', 'girl', 'short', 'child', 'baseball', 'man', 'fence', 'boy', 'shoe', 'letter', 'nose', '[UNK]', 'sock', 'hair', 'face', 'handle', 'arm', 'jersey', 'tree', 'jean', 'head', 'dirt', 'hat', 'field', 'woman', 'leg', 'young', 'little', 'spectator', 'logo', 'ground', 'writing', 'necklace', 'number', 'kid', 'pole', 'strap', 'stripe', 'mouth', 'chair', 'cap', 'ball', 'swing', 'bracelet'] 2022-03-17 01:40:12,483.483 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'little', 'short', 'field', 'ground', 'girl', 'person', 'child', 'arm', 'boy', 'eye', 'baseball', 'letter', 'shirt', 'jersey', 'leg', 'handle', 'grass', 'hat', 'pole', 'bat', 'fence', 'helmet', 'shoe', 'strap', 'spectator', 'sock'] 2022-03-17 01:42:36,389.389 2829:trainer.py:487 do_train_dict(): eta: 10:48:32 iter: 43800 speed: 280.7 images/sec total_norm: 143.8127 (149.4073) loss: 139.7323 (141.3447) masked_loss: 1.3922 (1.4657) tag_loss: 137.9831 (139.8790) time: 1.4331 (1.8243) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4278 (1.8190) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:42:36,749.749 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 01:42:36,750.750 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 119.12374114990234 2022-03-17 01:42:36,750.750 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
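The speed figure is consistent with the batch size of 512 in the run name divided by the parenthesised average iteration time. Inverting a plausible eta formula against the logged values also suggests where the training horizon sits, though the total iteration count is not stated in this excerpt; treat the 65100 below as an inference only:

```python
import datetime

batch_size = 512          # from the run name: batch-size_512
avg_iter_time = 1.8243    # parenthesised 'time' average at iter 43800
print(round(batch_size / avg_iter_time, 1))   # -> 280.7, as logged

# If eta = (max_iter - iter) * avg_iter_time, the logged eta of
# 10:48:32 (38912 s) at iter 43800 implies a horizon near
# 43800 + 38912 / 1.8243 ~= 65100 iterations (inferred, not logged).
print(datetime.timedelta(seconds=int((65100 - 43800) * avg_iter_time)))
# -> 10:47:37 under that assumed horizon, close to the logged eta
```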
= 71.11166987495162 2022-03-17 01:42:59,095.095 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0216783806681633 2022-03-17 01:42:59,095.095 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:42:59,096.096 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boat', 'in', 'the', '[MASK]', 'with', 'several', 'people', 'on', 'it', 'with', 'a', 'flag', 'on', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:42:59,111.111 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'person', 'boat', 'man', 'flag', 'sky', 'wave', 'reflection', 'ripple', 'wake', 'pole', 'jacket', 'ocean', 'motor', 'shadow', '[UNK]', 'head', 'door', 'small', 'raft', 'shirt', 'mountain', 'large', 'engine', 'snow', 'front', 'railing', 'hat', 'antenna', 'splash', 'seat', 'white', 'group', 'couple', 'speed', 'clear', 'fishing', 'stripe', 'top', 'ski', 'hair', 'blue', 'tire', 'open', 'american', 'vest', 'day', 'middle', 'ship', 'cabin'] 03-17 01:43:09.635 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 01:43:09.635 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 01:43:10.304 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 9}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}] 2022-03-17 01:43:15,097.097 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'several', 'water', 'person', 'sky', 'boat', 'ocean', 'wave', 'motor', 'flag', 'wake', 'pole', 'jacket', 'reflection', 'ripple'] 2022-03-17 01:45:38,723.723 2829:trainer.py:487 do_train_dict(): eta: 10:45:46 iter: 43900 speed: 280.8 images/sec total_norm: 144.3893 (146.5414) loss: 138.6804 (140.2232) masked_loss: 1.4478 (1.4852) tag_loss: 137.3409 (138.7380) time: 1.4340 (1.8233) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4287 (1.8181) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:45:39,084.084 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 01:45:39,084.084 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.24961853027344 2022-03-17 01:45:39,085.085 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.1247149814259 2022-03-17 01:46:01,268.268 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021682890132069588 2022-03-17 01:46:01,269.269 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:46:01,269.269 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zebra', 'has', 'stuck', 'its', 'head', 'into', 'a', '[MASK]', 'with', 'a', 'passenger', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:46:01,284.284 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'head', 'tree', 'sky', 'nose', 'ear', 'door', 'eye', '[UNK]', 'jean', 'zebra', 'face', 'mouth', 'windshield', 'grass', 'person', 'shirt', 'hair', 'window', 'muzzle', 'neck', 'jacket', 'fence', 'hand', 'button', 'mirror', 'man', 'vent', 'dashboard', 'bag', 'leg', 'mane', 'seat', 'arm', 'handle', 'ground', 'camera', 'stripe', 'light', 'roof', 'pole', 'next', 'front', 'wheel', 'steering', 'black', 'driver', 'close', 'wall', 'side'] 2022-03-17 01:46:17,114.114 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'door', 'car', 'hair', 'mouth', 'person', 'eye', 'tree', 'sky', 'jean', 'shirt', 'nose', 'bag', 'passenger', 'shoe', 'vent', 'windshield', 'zebra'] 2022-03-17 01:48:41,171.171 2829:trainer.py:487 do_train_dict(): eta: 10:42:59 iter: 44000 speed: 280.6 images/sec total_norm: 147.5025 (150.8654) loss: 140.0975 (141.8490) masked_loss: 1.5035 (1.4859) tag_loss: 138.3601 (140.3631) time: 1.4320 (1.8245) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4267 (1.8192) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:48:41,533.533 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 01:48:41,534.534 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.94195556640625 2022-03-17 01:48:41,534.534 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
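The "Input ids sample" lines show the caption pipeline's text input: WordPiece tokens wrapped in [CLS]/[SEP], a few positions swapped for [MASK], and [PAD] out to a fixed length of 70. A sketch of that preprocessing, assuming the HuggingFace bert-base-uncased tokenizer (the '##' word pieces in the log are consistent with it) and the usual 15% BERT masking rate, neither of which this excerpt confirms:

```python
import random
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def make_masked_input(caption, max_len=70, mask_rate=0.15):
    tokens = ["[CLS]"] + tokenizer.tokenize(caption) + ["[SEP]"]
    for i in range(1, len(tokens) - 1):       # keep [CLS]/[SEP] intact
        if random.random() < mask_rate:
            tokens[i] = "[MASK]"
    tokens += ["[PAD]"] * (max_len - len(tokens))   # pad to fixed length
    return tokenizer.convert_tokens_to_ids(tokens), tokens

ids, tokens = make_masked_input(
    "a zebra has stuck its head into a car with a passenger .")
print(tokens[:15])
```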
= 71.12121427194332 2022-03-17 01:49:03,951.951 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021646689623594284 2022-03-17 01:49:03,951.951 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:49:03,952.952 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'carolyn', 'the', 'floor', 'holding', 'a', 'ra', '##c', '##quet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:49:03,967.967 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'leg', 'sock', 'hand', 'short', 'man', '[UNK]', 'face', 'shoe', 'tennis', 'head', 'arm', 'ear', 'background', 'nose', 'hair', 'eye', 'stripe', 'logo', 'mouth', 'ball', 'handle', 'ground', 'band', 'knee', 'sleeve', 'shadow', 'player', 'court', 'wall', 'string', 'line', 'beard', 'white', 'male', 'floor', 'glasses', 'letter', 'collar', 'design', 'photo', 'red', 'match', 'finger', 'air', 'neck', 'wrist', 'grass', 'hat', 'net'] 2022-03-17 01:49:19,898.898 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'band', 'short', 'ground', 'hair', 'mouth', 'floor', 'wall', 'arm', 'eye', 'shirt', 'leg', 'background', 'nose', 'ear', 'handle', 'tennis', 'string', 'knee', 'shoe', 'stripe', 'sock'] 2022-03-17 01:51:43,559.559 2829:trainer.py:487 do_train_dict(): eta: 10:40:12 iter: 44100 speed: 280.7 images/sec total_norm: 146.6852 (150.8122) loss: 139.5549 (140.8831) masked_loss: 1.4715 (1.5170) tag_loss: 138.4529 (139.3660) time: 1.4338 (1.8239) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4286 (1.8188) save_time: 8.8805 (16.9902) lr: 0.000034 max mem: 26307 2022-03-17 01:51:43,921.921 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 01:51:43,921.921 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 121.39515686035156 2022-03-17 01:51:43,921.921 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.12570458839382 2022-03-17 01:52:06,042.042 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02166532538831234 2022-03-17 01:52:06,043.043 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:52:06,043.043 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'zebra', '[MASK]', 'foraging', 'in', 'the', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:52:06,059.059 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'shadow', 'leg', 'ground', 'mane', 'grass', 'head', '[UNK]', 'ear', 'stripe', 'neck', 'tail', 'eye', 'dirt', 'nose', 'mouth', 'rock', 'fence', 'tree', 'trunk', 'field', 'body', 'log', 'bush', 'background', 'spot', 'water', 'pole', 'branch', 'wall', 'hay', 'other', 'reflection', 'bird', 'back', 'couple', 'next', 'area', 'mesh', 'group', 'leaf', 'grazing', 'shade', 'plant', 'food', 'line', 'zoo', 'post', 'hill', 'grassy'] 2022-03-17 01:52:21,969.969 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'ground', 'mouth', 'eye', 'neck', 'tree', 'leg', 'ear', 'shadow', 'grass', 'tail', 'dirt', 'fence', 'log', 'stripe', 'mane', 'zebra'] 2022-03-17 01:54:46,085.085 2829:trainer.py:487 do_train_dict(): eta: 10:37:25 iter: 44200 speed: 280.5 images/sec total_norm: 147.2032 (152.4778) loss: 143.3514 (144.0022) masked_loss: 1.5084 (1.5121) tag_loss: 141.9420 (142.4901) time: 1.4341 (1.8252) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4290 (1.8200) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 01:54:46,447.447 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 01:54:46,447.447 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.79379272460938 2022-03-17 01:54:46,447.447 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
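The caption acc values are all small-denominator fractions (0.6060606... is 20/33, 0.529411... is 18/34), which is consistent with accuracy measured only over the handful of masked positions scored in each batch. A sketch under that assumption:

```python
import torch

def masked_token_accuracy(logits, target_ids, mask_positions):
    """Accuracy of the language head, scored only where the input was
    masked.  logits: (seq, vocab); target_ids, mask_positions: (seq,)."""
    pred = logits.argmax(dim=-1)
    correct = (pred[mask_positions] == target_ids[mask_positions]).float()
    return correct.mean().item()

print(20 / 33)   # 0.6060606..., matching the acc logged at iter 43700
```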
= 71.12467487240484 2022-03-17 01:55:08,871.871 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021704429760575294 2022-03-17 01:55:08,871.871 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:55:08,872.872 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bike', 'is', 'tied', 'to', 'a', '[MASK]', 'while', 'a', 'su', '##fer', 'walks', 'on', 'the', '[MASK]', '[MASK]', 'the', 'water', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:55:08,887.887 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'cloud', 'pole', 'bicycle', '[UNK]', 'bike', 'wave', 'ocean', 'bird', 'man', 'board', 'beach', 'wheel', 'person', 'sand', 'hair', 'tire', 'horizon', 'wing', 'short', 'boy', 'head', 'shirt', 'surf', 'arm', 'woman', 'rock', 'kite', 'basket', 'shore', 'leg', 'seat', 'post', 'bag', 'shadow', 'boat', 'top', 'hat', 'paddle', 'footprint', 'body', 'suit', 'foot', 'back', 'pedal', 'child', 'handle', 'hand', 'ground'] 2022-03-17 01:55:24,831.831 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'board', 'person', 'seat', 'arm', 'boy', 'wing', 'beach', 'sky', 'shirt', 'ocean', 'leg', 'wave', 'bird', 'wheel', 'sand', 'cloud', 'pole', 'walks', 'bike', 'bicycle', 'pedal'] 2022-03-17 01:57:48,699.699 2829:trainer.py:487 do_train_dict(): eta: 10:34:38 iter: 44300 speed: 280.4 images/sec total_norm: 148.8967 (150.1477) loss: 139.6569 (141.2873) masked_loss: 1.4802 (1.4573) tag_loss: 138.4310 (139.8299) time: 1.4327 (1.8262) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8206) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 01:57:49,060.060 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 01:57:49,061.061 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.86709594726562 2022-03-17 01:57:49,061.061 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.13292645119331 2022-03-17 01:58:11,573.573 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02169646881520748 2022-03-17 01:58:11,573.573 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 01:58:11,574.574 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'picture', 'of', 'a', 'steep', '##le', 'and', 'clocks', '[MASK]', 'a', 'building', 'with', 'a', 'snow', '[MASK]', 'roof', 'with', 'a', 'neon', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 01:58:11,589.589 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'building', 'tower', 'sky', 'window', 'roof', 'sign', 'light', '[UNK]', 'top', 'snow', 'ground', 'tree', 'hill', 'church', 'hand', 'large', 'pole', 'weather', 'spire', 'wall', 'bird', 'lamp', 'night', 'bush', 'bus', 'tall', 'chimney', 'house', 'cross', 'vane', 'front', 'car', 'dome', 'street', 'door', 'white', 'triangle', 'rock', 'traffic', 'truck', 'road', 'big', 'old', 'middle', 'towering', 'statue', 'train', 'side', 'snowy'] 2022-03-17 01:58:27,532.532 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'light', 'distance', 'window', 'tree', 'tower', 'branch', 'sign', 'sky', 'roof', 'snow', 'clock', 'neon', 'chimney'] 2022-03-17 02:00:51,147.147 2829:trainer.py:487 do_train_dict(): eta: 10:31:51 iter: 44400 speed: 280.6 images/sec total_norm: 146.3768 (148.8367) loss: 144.4569 (144.4037) masked_loss: 1.5678 (1.5769) tag_loss: 142.9069 (142.8268) time: 1.4327 (1.8245) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8193) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:00:51,508.508 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4864864945411682 2022-03-17 02:00:51,508.508 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.26873779296875 2022-03-17 02:00:51,509.509 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
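Each "Sample Generation" line lists exactly 50 tags, which looks like the top-50 scores from the tag classifier head rather than a thresholded set. A sketch of that decoding, assuming a multi-label sigmoid head (not confirmed by this excerpt):

```python
import torch

def top_tags(tag_logits, vocab, k=50):
    """Return the k highest-scoring tags for one image.
    tag_logits: (num_tags,) raw scores from the tag classifier head."""
    scores = torch.sigmoid(tag_logits)        # multi-label probabilities
    topk = torch.topk(scores, k=min(k, scores.numel()))
    return [vocab[i] for i in topk.indices.tolist()]

vocab = ["ski", "boot", "snow", "pole", "tree"]
print(top_tags(torch.tensor([2.0, -1.0, 3.5, 0.2, -0.5]), vocab, k=3))
# -> ['snow', 'ski', 'pole']
```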
= 71.12974012567756 2022-03-17 02:01:14,037.037 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021763863041996956 2022-03-17 02:01:14,037.037 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:01:14,038.038 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'red', '[MASK]', '[MASK]', 'a', 'cemetery', 'beside', 'a', 'forest', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:01:14,053.053 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'tree', 'pole', 'stop', 'letter', 'ground', 'trunk', 'leaf', 'branch', '[UNK]', 'red', 'road', 'bolt', 'fence', 'weed', 'sky', 'grass', 'writing', 'arrow', 'wood', 'plant', 'bush', 'forest', 'post', 'next', 'street', 'dirt', 'wire', 'front', 'corner', 'graffiti', 'way', 'base', 'area', 'line', 'window', 'white', 'side', 'car', 'wooded', 'bench', 'name', 'rock', 'box', 'green', 'word', 'path', 'back', 'light', 'roof'] 2022-03-17 02:01:29,991.991 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'red', 'ground', 'stop', 'forest', 'tree', 'letter', 'sign', 'cemetery', 'grass', 'pole', 'leaf', 'trunk'] 2022-03-17 02:03:53,621.621 2829:trainer.py:487 do_train_dict(): eta: 10:29:04 iter: 44500 speed: 280.6 images/sec total_norm: 145.3201 (148.7217) loss: 142.4798 (143.7655) masked_loss: 1.4907 (1.5357) tag_loss: 140.6557 (142.2298) time: 1.4331 (1.8248) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.8196) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:03:53,982.982 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 02:03:53,983.983 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.45477294921875 2022-03-17 02:03:53,983.983 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.13863661043312 2022-03-17 02:04:16,478.478 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02181190438568592 2022-03-17 02:04:16,478.478 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:04:16,479.479 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'in', 'a', 'car', '[MASK]', 'a', 'sandwich', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:04:16,494.494 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'car', 'tree', '[UNK]', 'person', 'mouth', 'eye', 'bread', 'jacket', 'seat', 'face', 'shirt', 'nose', 'windshield', 'head', 'finger', 'sky', 'hand', 'food', 'ear', 'arm', 'chair', 'button', 'table', 'sleeve', 'dog', 'vehicle', 'man', 'thumb', 'building', 'roof', 'hair', 'banana', 'logo', 'sandwich', 'door', 'hat', 'collar', 'road', 'handle', 'bun', 'paper', 'small', 'curtain', 'light', 'bus', 'wall', 'bear', 'label', 'top'] 2022-03-17 02:04:32,481.481 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'little', 'face', 'car', 'hair', 'mouth', 'seat', 'boy', 'eye', 'window', 'tree', 'shirt', 'finger', 'nose', 'ear', 'pocket', 'tag', 'button', 'jacket', 'bread', 'sleeve', 'sandwich', 'strap'] 2022-03-17 02:06:56,275.275 2829:trainer.py:487 do_train_dict(): eta: 10:26:17 iter: 44600 speed: 280.3 images/sec total_norm: 147.1895 (150.9362) loss: 143.2369 (143.5238) masked_loss: 1.5019 (1.5042) tag_loss: 142.0414 (142.0195) time: 1.4331 (1.8264) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4281 (1.8213) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:06:56,638.638 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 02:06:56,638.638 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.19956970214844 2022-03-17 02:06:56,638.638 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
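"Tag Precision." moves by only hundredths of a point between reports (71.12 to 71.14 over several hundred iterations), which suggests a cumulative statistic over the whole run rather than a per-batch value. A sketch of such a running meter, under that assumption:

```python
class RunningPrecision:
    """Cumulative precision in percent: of all tags predicted so far,
    the fraction that appeared in the ground-truth tag sets."""

    def __init__(self):
        self.hits = 0
        self.predicted = 0

    def update(self, pred_tags, gt_tags):
        gt = set(gt_tags)
        self.hits += sum(tag in gt for tag in pred_tags)
        self.predicted += len(pred_tags)

    @property
    def value(self):
        return 100.0 * self.hits / max(self.predicted, 1)

meter = RunningPrecision()
meter.update(["ski", "boot", "snow"], ["snow", "ski", "pole"])
print(round(meter.value, 2))   # 66.67 for this toy update
```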
= 71.14188735757098 2022-03-17 02:07:19,088.088 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02181151881814003 2022-03-17 02:07:19,089.089 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:07:19,089.089 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', 'motorcycle', 'next', '[MASK]', 'a', 'red', 'city', 'bus', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:07:19,105.105 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'road', 'helmet', 'jacket', 'motorcycle', 'tire', 'plate', 'street', 'bus', 'license', 'sidewalk', 'bike', 'window', 'curb', 'line', 'officer', 'person', 'boot', 'police', 'pole', '[UNK]', 'light', 'shoe', 'wheel', 'door', 'sign', 'head', 'arrow', 'stripe', 'vehicle', 'shadow', 'safety', 'glove', 'bicycle', 'woman', 'city', 'handle', 'bag', 'policeman', 'car', 'back', 'wall', 'van', 'traffic', 'tree', 'tail', 'pipe', 'building', 'hair', 'mirror'] 2022-03-17 02:07:35,136.136 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'man', 'line', 'next', 'road', 'street', 'red', 'light', 'police', 'person', 'officer', 'window', 'shirt', 'bus', 'truck', 'plate', 'wheel', 'hat', 'license', 'cap', 'pole', 'jacket', 'bike', 'cop', 'motorcycle', 'boot', 'helmet', 'shoe', 'sidewalk', 'tire', 'curb'] 2022-03-17 02:09:58,959.959 2829:trainer.py:487 do_train_dict(): eta: 10:23:29 iter: 44700 speed: 280.3 images/sec total_norm: 147.8476 (150.8522) loss: 140.2339 (142.5799) masked_loss: 1.4206 (1.4493) tag_loss: 138.4982 (141.1306) time: 1.4340 (1.8269) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4288 (1.8218) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:09:59,319.319 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 02:09:59,319.319 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.57757568359375 2022-03-17 02:09:59,319.319 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.14805282013756 2022-03-17 02:10:21,832.832 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021816806867718697 2022-03-17 02:10:21,832.832 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:10:21,833.833 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', 'pastry', 'dessert', '##s', '[MASK]', 'two', 'glasses', 'of', 'wine', 'are', 'on', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:10:21,848.848 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['plate', 'table', 'light', 'dessert', 'fork', 'cake', '[UNK]', 'food', 'paper', 'car', 'sauce', 'napkin', 'topping', 'background', 'cream', 'piece', 'crust', 'whipped', 'mushroom', 'spoon', 'cup', 'pie', 'restaurant', 'chocolate', 'bread', 'handle', 'newspaper', 'coffee', 'shadow', 'glass', 'bowl', 'reflection', 'layer', 'window', 'slice', 'person', 'pizza', 'dish', 'meat', 'base', 'white', 'olive', 'knife', 'menu', 'bottle', 'ball', 'stem', 'ice', 'wine', 'delicious'] 2022-03-17 02:10:37,758.758 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'light', 'car', 'person', 'table', 'base', 'glass', 'paper', 'shirt', 'wine', 'plate', 'shadow', 'knife', 'meat', 'bread', 'stem', 'fork', 'dish', 'dessert', 'crust', 'napkin', 'topping', 'pastry'] 2022-03-17 02:13:01,687.687 2829:trainer.py:487 do_train_dict(): eta: 10:20:42 iter: 44800 speed: 280.2 images/sec total_norm: 149.9937 (153.1639) loss: 139.3013 (139.9245) masked_loss: 1.4402 (1.4530) tag_loss: 137.5764 (138.4716) time: 1.4335 (1.8273) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8221) save_time: 8.8805 (16.9902) lr: 0.000033 max mem: 26307 2022-03-17 02:13:02,047.047 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-17 02:13:02,047.047 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.20221710205078 2022-03-17 02:13:02,047.047 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.16377372996578 03-17 02:13:10.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 02:13:10.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 02:13:11.103 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-17 02:13:24,578.578 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021821388974785805 2022-03-17 02:13:24,579.579 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:13:24,579.579 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'photograph', '[MASK]', 'a', 'produce', 'stand', 'in', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:13:24,594.594 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['banana', '[UNK]', 'vegetable', 'market', 'shirt', 'person', 'produce', 'pole', 'carrot', 'bag', 'onion', 'table', 'crate', 'stand', 'basket', 'fruit', 'sign', 'potato', 'man', 'woman', 'apple', 'ground', 'hair', 'bunch', 'cabbage', 'mango', 'head', 'skirt', 'pepper', 'stick', 'box', 'ceiling', 'plastic', 'display', 'squash', 'umbrella', 'building', 'hand', 'shelf', 'roof', 'pumpkin', 'leaf', 'bin', 'canopy', 'hat', 'tomato', 'jean', 'window', 'garlic', 'street'] 2022-03-17 02:13:40,481.481 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'ground', 'person', 'table', 'market', 'stand', 'shirt', 'produce', 'roof', 'shadow', 'plastic', 'photograph', 'potato', 'banana', 'vegetable', 'onion', 'carrot', 'pumpkin', 'crate'] 2022-03-17 02:16:04,486.486 2829:trainer.py:487 do_train_dict(): eta: 10:17:55 iter: 44900 speed: 280.1 images/sec total_norm: 147.0449 (149.0564) loss: 136.5453 (140.1855) masked_loss: 1.3919 (1.4591) tag_loss: 135.0207 (138.7264) time: 1.4326 (1.8280) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4273 (1.8229) save_time: 8.8805 (16.9902) lr: 0.000032 max mem: 26307 2022-03-17 02:16:04,848.848 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 02:16:04,848.848 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.12722778320312 2022-03-17 02:16:04,848.848 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
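The monitor() entries report one dict per GPU in the form {'mem_used': ..., 'mem_total': ..., 'gpu_util': ...}. The same structure can be produced with nvidia-smi's query mode; a sketch (aml_server.py logs the plain nvidia-smi command, so its actual parsing may differ):

```python
import subprocess

def gpu_monitor():
    """One dict per GPU, shaped like the monitor() log entries."""
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total,utilization.gpu",
        "--format=csv,noheader,nounits",
    ]).decode()
    stats = []
    for line in out.strip().splitlines():     # e.g. '29000, 32510, 97'
        used, total, util = (int(x) for x in line.split(", "))
        stats.append({"mem_used": used, "mem_total": total,
                      "gpu_util": util})
    return stats
```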
= 71.16637097676595 2022-03-17 02:16:27,367.367 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021904485300183296 2022-03-17 02:16:27,368.368 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:16:27,368.368 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dimly', 'lit', 'din', '##ning', 'table', 'with', 'a', 'flower', '[MASK]', '##piece', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:16:27,383.383 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'vase', 'flower', 'glass', 'bottle', 'wine', 'woman', 'leaf', 'shadow', 'stem', 'plant', 'label', 'hair', 'wall', 'water', '[UNK]', 'box', 'nose', 'mouth', 'face', 'shirt', 'watch', 'base', 'bowl', 'room', 'hand', 'eye', 'napkin', 'shelf', 'background', 'book', 'light', 'logo', 'person', 'glasses', 'girl', 'picture', 'head', 'lamp', 'bouquet', 'paper', 'fork', 'candle', 'jacket', 'plate', 'next', 'card', 'chair', 'bag', 'refrigerator'] 2022-03-17 02:16:43,187.187 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'woman', 'hair', 'table', 'wall', 'glass', 'plant', 'box', 'shirt', 'label', 'bottom', 'nose', 'wine', 'shadow', 'lit', 'bottle', 'flower', 'leaf', 'stem', 'shelf', 'vase', 'dimly'] 2022-03-17 02:19:07,469.469 2829:trainer.py:487 do_train_dict(): eta: 10:15:08 iter: 45000 speed: 279.8 images/sec total_norm: 145.5425 (146.9968) loss: 138.9292 (139.9383) masked_loss: 1.4847 (1.4861) tag_loss: 137.5657 (138.4522) time: 1.4327 (1.8298) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4275 (1.8246) save_time: 8.8805 (16.9902) lr: 0.000032 max mem: 26307 2022-03-17 02:19:07,471.471 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0045000.pt 2022-03-17 02:19:16,620.620 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 02:19:16,620.620 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.8945770263672 2022-03-17 02:19:16,621.621 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
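At iter 45000 the trainer writes snapshot/model_iter_0045000.pt; together with the recurring save_time entries, this points to a fixed checkpoint cadence. A sketch, where the 5000-iteration period and the saved fields are assumptions read off the snapshot name, not taken from checkpoint.py:

```python
import torch

def maybe_save(model, optimizer, iteration, out_dir, period=5000):
    """Write a snapshot every `period` iterations (period is assumed)."""
    if iteration % period != 0:
        return
    path = f"{out_dir}/snapshot/model_iter_{iteration:07d}.pt"
    torch.save({"iteration": iteration,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)
```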
= 71.17092167247425 2022-03-17 02:19:39,232.232 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02189561165869236 2022-03-17 02:19:39,233.233 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:19:39,234.234 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'several', 'bundles', '[MASK]', 'fruit', 'hanging', '[MASK]', 'a', 'plant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:19:39,249.249 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'leaf', 'banana', 'stem', 'trunk', 'bunch', 'plant', 'flower', 'branch', 'sky', '[UNK]', 'green', 'moss', 'fruit', 'vine', 'bush', 'rock', 'large', 'building', 'fence', 'bananas', 'forest', 'top', 'ground', 'jungle', 'shadow', 'bark', 'grass', 'stalk', 'tropical', 'ripe', 'group', 'water', 'tail', 'light', 'lush', 'bottom', 'wire', 'big', 'wall', 'small', 'dirt', 'side', 'red', 'front', 'picture', 'road', 'cluster', 'area', 'fern'] 2022-03-17 02:19:55,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'rock', 'plant', 'tree', 'branch', 'sky', 'hanging', 'fruit', 'flower', 'leaf', 'stem', 'trunk', 'bunch', 'banana'] 2022-03-17 02:22:17,969.969 2829:trainer.py:487 do_train_dict(): eta: 10:12:24 iter: 45100 speed: 268.8 images/sec total_norm: 147.0945 (149.3187) loss: 139.8319 (140.7661) masked_loss: 1.4648 (1.4864) tag_loss: 138.1013 (139.2797) time: 1.4329 (1.9050) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4278 (1.8121) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:22:18,331.331 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-17 02:22:18,331.331 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.73262786865234 2022-03-17 02:22:18,331.331 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.18742867697657 2022-03-17 02:22:41,076.076 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02189384214580059 2022-03-17 02:22:41,077.077 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:22:41,077.077 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bathroom', 'with', 'a', '[MASK]', 'toilet', ',', 'white', '[MASK]', 'and', 'white', 'shower', 'curtain', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:22:41,093.093 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bathroom', 'sink', 'mirror', '[UNK]', 'curtain', 'picture', 'toilet', 'handle', 'towel', 'light', 'floor', 'cabinet', 'lid', 'rug', 'frame', 'door', 'bottle', 'reflection', 'shower', 'tank', 'bowl', 'rack', 'knob', 'soap', 'tile', 'shelf', 'holder', 'drawer', 'base', 'vanity', 'white', 'dish', 'seat', 'basket', 'ring', 'ceiling', 'tub', 'switch', 'decoration', 'rod', 'fixture', 'outlet', 'table', 'box', 'plate', 'stand', 'paper', 'pipe', 'leg'] 2022-03-17 02:22:57,088.088 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'door', 'light', 'table', 'wall', 'stand', 'ring', 'picture', 'frame', 'tank', 'handle', 'mirror', 'bathroom', 'shower', 'switch', 'sink', 'reflection', 'towel', 'curtain', 'toilet', 'lid', 'tub', 'magnet', 'knob', 'rug'] 2022-03-17 02:25:21,112.112 2829:trainer.py:487 do_train_dict(): eta: 10:09:37 iter: 45200 speed: 279.6 images/sec total_norm: 147.8916 (151.8833) loss: 139.5687 (139.2047) masked_loss: 1.4275 (1.4417) tag_loss: 138.2020 (137.7629) time: 1.4343 (1.8314) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4290 (1.8262) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:25:21,472.472 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 02:25:21,473.473 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.7848129272461 2022-03-17 02:25:21,473.473 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
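Tag mAP hovers around 0.0217 and is re-reported each cycle. One standard way to get such a figure is per-tag average precision over the vocabulary, mean-reduced; a sketch with scikit-learn, noting the pipeline's own implementation may differ:

```python
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(scores, labels):
    """Mean average precision over the tag vocabulary.
    scores: (N, num_tags) predictions; labels: (N, num_tags) in {0, 1}."""
    aps = [average_precision_score(labels[:, t], scores[:, t])
           for t in range(labels.shape[1])
           if labels[:, t].any()]             # skip tags with no positives
    return float(np.mean(aps))

rng = np.random.default_rng(0)
print(tag_map(rng.random((8, 5)), rng.integers(0, 2, (8, 5))))
```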
= 71.19178291135539 2022-03-17 02:25:44,156.156 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02191152796149254 2022-03-17 02:25:44,157.157 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:25:44,157.157 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bowl', 'full', 'of', 'multi', 'colored', 'pasta', 'and', '[MASK]', '##coll', '##i', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:25:44,172.172 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'table', 'bowl', 'food', 'pasta', 'shrimp', 'plate', 'rice', 'chicken', 'handle', 'cup', 'tomato', 'spoon', 'salad', 'meat', 'dish', 'pepper', 'fork', 'vegetable', 'pea', 'knife', 'carrot', 'onion', 'corn', 'stem', 'meal', 'container', 'white', 'flower', 'lid', 'mushroom', 'full', 'red', 'cheese', 'fry', 'colorful', 'top', 'blue', 'glass', 'wooden', 'close', 'olive', 'pizza', 'side', 'sausage', 'can', 'lemon', 'mixed', 'large', 'next'] 2022-03-17 02:26:00,096.096 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'full', 'cup', 'table', 'food', 'bowl', 'multi', 'handle', 'plate', 'rice', 'chicken', 'fork', 'vegetable', 'spoon', 'shrimp', 'pasta'] 2022-03-17 02:28:24,316.316 2829:trainer.py:487 do_train_dict(): eta: 10:06:49 iter: 45300 speed: 279.5 images/sec total_norm: 145.3128 (147.4635) loss: 142.8366 (141.5086) masked_loss: 1.3910 (1.4478) tag_loss: 141.4757 (140.0607) time: 1.4335 (1.8320) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.8268) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:28:24,677.677 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-17 02:28:24,677.677 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 177.65623474121094 2022-03-17 02:28:24,677.677 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.1881763861568 2022-03-17 02:28:47,621.621 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021909112110733986 2022-03-17 02:28:47,621.621 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:28:47,621.621 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'green', 'train', 'is', '##nesia', 'into', 'a', 'station', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:28:47,638.638 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'train', 'sky', 'bridge', 'number', '[UNK]', 'line', 'light', 'track', 'platform', 'front', 'windshield', 'door', 'sign', 'beam', 'building', 'car', 'yellow', 'pole', 'wire', 'station', 'fence', 'green', 'bumper', 'writing', 'wall', 'roof', 'puddle', 'sidewalk', 'walkway', 'shadow', 'tree', 'cloud', 'engine', 'handle', 'street', 'stripe', 'road', 'logo', 'gravel', 'plate', 'ground', 'water', 'blue', 'letter', 'railing', 'traffic', 'ladder', 'bush', 'pavement'] 2022-03-17 02:29:03,511.511 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'line', 'station', 'front', 'light', 'green', 'bridge', 'mountain', 'window', 'train', 'sign', 'sky', 'yellow', 'text', 'bus', 'traffic', 'platform', 'handle', 'plate', 'license', 'pole', 'beam', 'ladder', 'stripe', 'puddle'] 2022-03-17 02:31:27,235.235 2829:trainer.py:487 do_train_dict(): eta: 10:04:02 iter: 45400 speed: 279.9 images/sec total_norm: 146.7626 (149.5498) loss: 142.2525 (143.1962) masked_loss: 1.4364 (1.4794) tag_loss: 140.6587 (141.7168) time: 1.4327 (1.8292) data: 0.0001 (0.0005) to_device: 0.0049 (0.0049) time_gpu: 1.4279 (1.8238) save_time: 8.8805 (16.0781) lr: 0.000032 max mem: 26307 2022-03-17 02:31:27,597.597 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 02:31:27,598.598 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.1650390625 2022-03-17 02:31:27,598.598 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.19887241740803 2022-03-17 02:31:50,536.536 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021944560110569 2022-03-17 02:31:50,537.537 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:31:50,537.537 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'that', 'have', 'some', '[MASK]', 'in', 'it', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:31:50,553.553 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['boat', 'shirt', 'hair', 'man', 'hand', 'container', 'rope', 'food', 'water', 'bowl', 'pole', 'head', '[UNK]', 'basket', 'bin', 'boy', 'fish', 'person', 'arm', 'sauce', 'meat', 'paddle', 'bag', 'bucket', 'handle', 'ear', 'tire', 'tray', 'stripe', 'box', 'carrot', 'jean', 'vegetable', 'knife', 'dish', 'small', 'wall', 'foot', 'cloth', 'plastic', 'mirror', 'lid', 'jacket', 'something', 'other', 'dock', 'pizza', 'woman', 'bottle', 'young'] 2022-03-17 02:32:06,456.456 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'water', 'hair', 'food', 'arm', 'shirt', 'fish', 'boat', 'bowl', 'meat', 'pole', 'bin', 'rope', 'brush', 'bunch', 'basket', 'container', 'tire', 'sauce', 'bowls', 'paddle'] 2022-03-17 02:34:30,209.209 2829:trainer.py:487 do_train_dict(): eta: 10:01:14 iter: 45500 speed: 279.8 images/sec total_norm: 146.0719 (149.9861) loss: 139.7415 (139.5959) masked_loss: 1.5358 (1.5258) tag_loss: 137.8792 (138.0701) time: 1.4325 (1.8297) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4273 (1.8246) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:34:30,570.570 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-17 02:34:30,570.570 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.72402954101562 2022-03-17 02:34:30,570.570 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
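lr steps down by 1e-6 roughly every 600-700 iterations in this excerpt (0.000034 at iter 44100 down to 0.000031 by iter 45500). That slope matches a linear ramp of about 1e-4 per ~65k iterations, in line with the lr_1e-4 tag in the run name. A sketch, with the horizon an inference rather than a logged fact:

```python
def linear_lr(iteration, base_lr=1e-4, total_iters=65000):
    # base_lr from the run name (lr_1e-4); total_iters inferred from the
    # ~1e-6 drop per ~650 iterations seen in this excerpt.
    return base_lr * max(0.0, 1.0 - iteration / total_iters)

print(f"{linear_lr(45500):.6f}")   # 0.000030, one step off the logged value
```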
= 71.2002185269406 2022-03-17 02:34:53,718.718 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021948710083961487 2022-03-17 02:34:53,719.719 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:34:53,719.719 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'kitchen', 'has', 'a', 'free', 'standing', 'counter', 'and', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:34:53,734.734 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'wall', 'chair', '[UNK]', 'bottle', 'apple', 'bowl', 'clock', 'window', 'tray', 'orange', 'fruit', 'shelf', 'cup', 'room', 'bucket', 'mat', 'drawer', 'handle', 'container', 'lid', 'can', 'stand', 'paper', 'kettle', 'cabinet', 'mirror', 'curtain', 'towel', 'cloth', 'mug', 'light', 'knife', 'plate', 'basket', 'pen', 'kitchen', 'pot', 'coffee', 'knob', 'microwave', 'maker', 'item', 'bag', 'dining', 'label', 'ceiling', 'pitcher', 'box', 'desk'] 2022-03-17 02:35:09,700.700 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'book', 'light', 'cup', 'free', 'radio', 'table', 'wall', 'standing', 'chair', 'plant', 'window', 'kitchen', 'label', 'coffee', 'wine', 'orange', 'bowl', 'counter', 'frame', 'clock', 'mirror', 'bottle', 'fruit', 'apple', 'flower', 'pen', 'cloth', 'item', 'pot', 'maker', 'dish', 'shelf', 'container', 'tray', 'marker', 'drawer', 'mat', 'bucket', 'jar', 'vase'] 2022-03-17 02:37:33,631.631 2829:trainer.py:487 do_train_dict(): eta: 9:58:27 iter: 45600 speed: 279.1 images/sec total_norm: 146.8478 (151.2619) loss: 137.7772 (141.0801) masked_loss: 1.4549 (1.4667) tag_loss: 136.3211 (139.6134) time: 1.4336 (1.8342) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.8291) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:37:33,991.991 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 02:37:33,991.991 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.31912231445312 2022-03-17 02:37:33,991.991 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.20183854551753 2022-03-17 02:37:56,956.956 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021926330402493477 2022-03-17 02:37:56,957.957 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:37:56,957.957 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'brown', 'couch', 'and', 'white', 'curtains', 'in', 'this', 'living', 'room', '[MASK]', 'are', '[MASK]', 'stairs', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:37:56,972.972 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'room', 'couch', 'floor', 'wall', 'table', 'door', 'curtain', 'chair', 'sofa', 'pillow', 'living', 'ceiling', 'balcony', 'carpet', 'furniture', 'mirror', 'coffee', 'stair', 'cabinet', 'lamp', 'light', 'circle', 'rug', 'glass', '[UNK]', 'ottoman', 'shade', 'design', 'large', 'rod', 'vent', 'staircase', 'handle', 'cushion', 'book', 'television', 'bowl', 'shelf', 'doorway', 'plant', 'picture', 'railing', 'building', 'switch', 'stand', 'area', 'top', 'step', 'stool'] 2022-03-17 02:38:12,831.831 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'white', 'door', 'living', 'television', 'floor', 'table', 'wall', 'brown', 'chair', 'plant', 'window', 'coffee', 'cabinet', 'bottle', 'ottoman', 'couch', 'hook', 'pillow', 'sofa', 'staircase', 'curtain', 'balcony', 'tray', 'railing', 'dresser', 'cushion', 'stair'] 2022-03-17 02:40:36,741.741 2829:trainer.py:487 do_train_dict(): eta: 9:55:39 iter: 45700 speed: 279.6 images/sec total_norm: 146.2788 (149.3616) loss: 140.8254 (142.2374) masked_loss: 1.5306 (1.5512) tag_loss: 139.0315 (140.6861) time: 1.4337 (1.8311) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4283 (1.8259) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:40:37,102.102 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 02:40:37,102.102 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.29315185546875 2022-03-17 02:40:37,103.103 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.20614439951801 2022-03-17 02:41:00,247.247 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02196500077843666 2022-03-17 02:41:00,247.247 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:41:00,247.247 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'sheep', 'crowd', 'the', 'street', 'in', 'a', '[MASK]', 'near', '[MASK]', 'and', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:41:00,263.263 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'ground', 'roof', 'sky', 'rock', 'sheep', 'snow', 'house', 'person', 'door', 'pole', 'mountain', 'old', 'wall', 'picture', 'horse', 'hill', 'tree', 'head', 'cloud', 'hat', 'barn', 'animal', '[UNK]', 'white', 'man', 'road', 'photo', 'cow', 'doorway', 'chimney', 'wheel', 'herd', 'coat', 'sign', 'car', 'front', 'light', 'pipe', 'street', 'fence', 'town', 'hillside', 'black', 'large', 'goat', 'carriage', 'cabin', 'snowy'] 2022-03-17 02:41:16,169.169 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'house', 'town', 'building', 'road', 'street', 'ground', 'rock', 'mountain', 'window', 'tree', 'horse', 'sky', 'picture', 'animal', 'roof', 'snow', 'symbol', 'doorway', 'trunk', 'sheep', 'cow'] 03-17 02:43:11.204 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 02:43:11.204 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 02:43:12.348 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 89}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 88}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}] 2022-03-17 02:43:39,927.927 2829:trainer.py:487 do_train_dict(): eta: 9:52:51 iter: 45800 speed: 279.5 images/sec total_norm: 145.6557 (148.5404) loss: 141.4648 (141.0630) masked_loss: 1.4525 (1.4922) tag_loss: 139.9415 (139.5707) time: 1.4319 (1.8319) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4272 (1.8269) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:43:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 02:43:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.31019592285156 2022-03-17 02:43:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.20996450287065 2022-03-17 02:44:03,368.368 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.021973075345158577 2022-03-17 02:44:03,368.368 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:44:03,369.369 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'cars', 'are', 'stopped', 'at', 'a', 'traffic', '[MASK]', 'during', 'sunset', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:44:03,384.384 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'light', 'pole', 'car', 'road', 'line', 'traffic', 'tree', 'street', 'wire', 'truck', 'bus', 'window', 'sign', '[UNK]', 'cloud', 'mirror', 'intersection', 'windshield', 'building', 'sun', 'ground', 'view', 'power', 'wall', 'tail', 'vehicle', 'lot', 'red', 'wheel', 'person', 'sunset', 'van', 'night', 'picture', 'fence', 'tire', 'back', 'sidewalk', 'side', 'grass', 'door', 'tower', 'fire', 'background', 'city', 'photo', 'busy', 'station', 'bridge'] 2022-03-17 02:44:19,300.300 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['line', 'road', 'street', 'light', 'car', 'window', 'tree', 'sky', 'bus', 'traffic', 'truck', 'mirror', 'pole', 'wire', 'sunset', 'windshield'] 2022-03-17 02:46:43,146.146 2829:trainer.py:487 do_train_dict(): eta: 9:50:03 iter: 45900 speed: 279.4 images/sec total_norm: 148.6446 (151.2845) loss: 141.1665 (140.7816) masked_loss: 1.4903 (1.5185) tag_loss: 139.6264 (139.2631) time: 1.4334 (1.8322) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4286 (1.8271) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:46:43,509.509 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-17 02:46:43,509.509 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.8098373413086 2022-03-17 02:46:43,510.510 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
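For post-hoc analysis, the do_train_dict() records are regular enough to parse mechanically. A small sketch that pulls the iteration, speed, and global-average loss out of one record:

```python
import re

TRAIN_RE = re.compile(
    r"eta: (?P<eta>\S+) iter: (?P<iter>\d+) speed: (?P<speed>[\d.]+) images/sec"
    r".*?loss: (?P<loss>[\d.]+) \((?P<loss_avg>[\d.]+)\)")

record = ("eta: 10:54:06 iter: 43600 speed: 280.8 images/sec "
          "total_norm: 146.7132 (149.1552) loss: 141.0429 (142.9773)")
m = TRAIN_RE.search(record)
print(m.group("iter"), m.group("speed"), m.group("loss_avg"))
# -> 43600 280.8 142.9773
```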
= 71.22112145631209 2022-03-17 02:47:06,789.789 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02197863534092903 2022-03-17 02:47:06,789.789 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:47:06,790.790 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'of', 'chocolate', 'layer', 'cake', 'with', 'american', 'flag', 'planted', 'in', 'it', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:47:06,805.805 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cake', 'flag', 'table', 'spoon', '[UNK]', 'plate', 'handle', 'cloth', 'stick', 'fork', 'reflection', 'star', 'bowl', 'light', 'top', 'flower', 'piece', 'candle', 'bread', 'stripe', 'design', 'stem', 'cream', 'crust', 'glass', 'american', 'paper', 'cup', 'layer', 'water', 'dessert', 'small', 'slice', 'food', 'person', 'leaf', 'shadow', 'next', 'napkin', 'hand', 'white', 'tea', 'sugar', 'object', 'blade', 'close', 'ball', 'ice', 'bottle', 'coffee'] 2022-03-17 02:47:22,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'american', 'table', 'handle', 'plate', 'flag', 'chocolate', 'reflection', 'cake', 'spoon', 'crust'] 2022-03-17 02:49:46,215.215 2829:trainer.py:487 do_train_dict(): eta: 9:47:15 iter: 46000 speed: 279.7 images/sec total_norm: 146.9878 (148.8675) loss: 142.6540 (141.9819) masked_loss: 1.4465 (1.4693) tag_loss: 141.2993 (140.5126) time: 1.4325 (1.8307) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4272 (1.8255) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:49:46,576.576 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-17 02:49:46,576.576 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.797607421875 2022-03-17 02:49:46,576.576 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.22365999532109 2022-03-17 02:50:09,888.888 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02198127843439579 2022-03-17 02:50:09,888.888 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:50:09,889.889 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'open', 'on', 'a', 'desk', '[MASK]', 'dim', 'light', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:50:09,904.904 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['lamp', 'table', 'window', 'chair', 'wall', 'box', 'desk', 'shade', 'plate', 'keyboard', 'blind', 'room', 'floor', 'knife', 'cup', 'computer', '[UNK]', 'screen', 'paper', 'glass', 'lid', 'laptop', 'remote', 'monitor', 'light', 'bottle', 'mouse', 'water', 'can', 'napkin', 'speaker', 'pillow', 'book', 'mug', 'cd', 'printer', 'television', 'control', 'bowl', 'bag', 'car', 'guitar', 'coffee', 'cushion', 'handle', 'shirt', 'person', 'shelf', 'stack', 'phone'] 2022-03-17 02:50:25,894.894 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'room', 'open', 'door', 'light', 'cup', 'floor', 'table', 'wall', 'glass', 'chair', 'paper', 'computer', 'window', 'box', 'screen', 'coffee', 'desk', 'plate', 'knife', 'bottle', 'blind', 'couch', 'remote', 'mouse', 'monitor', 'shade', 'keyboard', 'lamp', 'sofa', 'dim', 'laptop', 'rack', 'mug', 'soda', 'ledge'] 2022-03-17 02:52:49,871.871 2829:trainer.py:487 do_train_dict(): eta: 9:44:28 iter: 46100 speed: 278.8 images/sec total_norm: 147.9836 (150.9202) loss: 139.2249 (141.8203) masked_loss: 1.4920 (1.5302) tag_loss: 137.3647 (140.2902) time: 1.4340 (1.8366) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.8315) save_time: 8.8805 (16.0781) lr: 0.000031 max mem: 26307 2022-03-17 02:52:50,233.233 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-17 02:52:50,233.233 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 110.102294921875 2022-03-17 02:52:50,234.234 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.23229049913812 2022-03-17 02:53:13,558.558 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022016631439328194 2022-03-17 02:53:13,558.558 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:53:13,559.559 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'climbs', 'a', 'ladder', 'up', '[MASK]', 'a', 'si', '##lo', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:53:13,574.574 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'truck', 'beam', 'shirt', '[UNK]', 'pole', 'sky', 'building', 'ramp', 'ladder', 'hat', 'windshield', 'window', 'post', 'head', 'roof', 'tire', 'wheel', 'jean', 'front', 'wall', 'ground', 'structure', 'belt', 'door', 'stair', 'cab', 'snow', 'person', 'license', 'light', 'mirror', 'sign', 'van', 'railing', 'plate', 'hair', 'pillar', 'jacket', 'ceiling', 'vehicle', 'white', 'leg', 'road', 'next', 'bumper', 'camera', 'cap', 'logo', 'grass'] 2022-03-17 02:53:29,416.416 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'building', 'road', 'front', 'light', 'ground', 'post', 'wall', 'window', 'jean', 'shirt', 'roof', 'truck', 'plate', 'wheel', 'belt', 'ceiling', 'hat', 'license', 'pole', 'beam', 'fence', 'pipe', 'steering', 'ladder', 'ramp', 'pillar', 'railing', 'windshield', 'bumper', 'stair'] 2022-03-17 02:55:53,447.447 2829:trainer.py:487 do_train_dict(): eta: 9:41:40 iter: 46200 speed: 278.9 images/sec total_norm: 147.2830 (150.2882) loss: 141.3240 (141.0717) masked_loss: 1.5054 (1.4939) tag_loss: 139.6562 (139.5778) time: 1.4327 (1.8357) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.8305) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 02:55:53,812.812 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 02:55:53,813.813 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 104.05728149414062 2022-03-17 02:55:53,813.813 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.24662524526114 2022-03-17 02:56:17,044.044 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022092510014772415 2022-03-17 02:56:17,044.044 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:56:17,044.044 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'rip', 'is', '[MASK]', 'in', 'coasts', 'with', 'batting', 'exposed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:56:17,060.060 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'umbrella', 'bed', 'design', 'blanket', 'banana', 'pillow', 'handle', 'yellow', 'vase', '[UNK]', 'top', 'tie', 'table', 'cloth', 'blue', 'colorful', 'white', 'paper', 'window', 'fish', 'light', 'butterfly', 'shadow', 'black', 'floral', 'ball', 'tag', 'towel', 'eye', 'bag', 'scissors', 'purple', 'small', 'leaf', 'wall', 'star', 'dot', 'bear', 'fabric', 'sheet', 'other', 'bunch', 'ear', 'material', 'pair', 'number', 'button', 'circle', 'animal'] 2022-03-17 02:56:32,938.938 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['design', 'bed', 'circle', 'flower', 'fabric', 'batting', 'rip', 'umbrella', 'banana'] 2022-03-17 02:58:56,846.846 2829:trainer.py:487 do_train_dict(): eta: 9:38:52 iter: 46300 speed: 279.2 images/sec total_norm: 146.1909 (149.5816) loss: 140.9943 (142.9112) masked_loss: 1.4123 (1.4693) tag_loss: 139.6194 (141.4419) time: 1.4314 (1.8340) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8289) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 02:58:57,207.207 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 02:58:57,208.208 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.2015380859375 2022-03-17 02:58:57,208.208 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.24297504589475 2022-03-17 02:59:20,480.480 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022096989676356316 2022-03-17 02:59:20,480.480 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 02:59:20,480.480 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'lit', 'toast', '##er', 'but', '[MASK]', 'toast', '[MASK]', '[MASK]', 'seated', 'next', 'to', 'a', 'microwave', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 02:59:20,496.496 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'window', '[UNK]', 'microwave', 'oven', 'rack', 'knob', 'towel', 'door', 'chair', 'person', 'reflection', 'handle', 'kitchen', 'button', 'light', 'dial', 'cord', 'panel', 'table', 'glass', 'tile', 'outlet', 'plate', 'food', 'cabinet', 'bag', 'top', 'display', 'room', 'cloth', 'leg', 'counter', 'container', 'shelf', 'mirror', 'floor', 'control', 'sink', 'paper', 'bowl', 'cup', 'pot', 'box', 'tray', 'bottle', 'black', 'stove', 'hair', 'lid'] 2022-03-17 02:59:36,486.486 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'next', 'door', 'light', 'person', 'wall', 'chair', 'window', 'metal', 'kitchen', 'empty', 'bag', 'lit', 'plastic', 'button', 'reflection', 'cord', 'dial', 'tile', 'rack', 'knob', 'oven', 'microwave'] 2022-03-17 03:02:00,495.495 2829:trainer.py:487 do_train_dict(): eta: 9:36:04 iter: 46400 speed: 278.8 images/sec total_norm: 149.7223 (152.0181) loss: 144.1093 (145.2236) masked_loss: 1.4588 (1.4851) tag_loss: 143.1072 (143.7384) time: 1.4341 (1.8365) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.8313) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:02:00,855.855 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5405405163764954 2022-03-17 03:02:00,856.856 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.45228576660156 2022-03-17 03:02:00,856.856 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.25549110494634 2022-03-17 03:02:24,091.091 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02210330218076706 2022-03-17 03:02:24,091.091 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:02:24,092.092 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'girl', 'is', 'holding', '[MASK]', 'game', '[MASK]', 'above', 'her', 'head', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:02:24,107.107 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'lamp', 'hand', 'sweater', 'hair', 'jean', 'woman', '[UNK]', 'face', 'shade', 'bottle', 'table', 'chair', 'door', 'head', 'picture', 'shirt', 'arm', 'controller', 'room', 'remote', 'cup', 'couch', 'man', 'ceiling', 'mouth', 'nose', 'switch', 'glasses', 'light', 'eye', 'floor', 'ear', 'book', 'cabinet', 'pillow', 'game', 'living', 'girl', 'beard', 'window', 'plate', 'toy', 'belt', 'person', 'phone', 'box', 'shelf', 'glass', 'frame'] 2022-03-17 03:02:40,014.014 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'game', 'face', 'room', 'door', 'woman', 'cup', 'heart', 'living', 'hair', 'girl', 'person', 'table', 'wall', 'character', 'chair', 'jean', 'shirt', 'picture', 'cabinet', 'bottle', 'couch', 'remote', 'switch', 'glasses', 'shade', 'beard', 'lamp', 'ribbon', 'controller', 'container', 'sweater', 'strap'] 2022-03-17 03:05:03,765.765 2829:trainer.py:487 do_train_dict(): eta: 9:33:16 iter: 46500 speed: 279.4 images/sec total_norm: 145.6473 (146.9723) loss: 140.5860 (142.6067) masked_loss: 1.4246 (1.4212) tag_loss: 139.1850 (141.1855) time: 1.4327 (1.8326) data: 0.0001 (0.0005) to_device: 0.0051 (0.0048) time_gpu: 1.4274 (1.8273) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:05:04,126.126 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 03:05:04,126.126 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.01873779296875 2022-03-17 03:05:04,126.126 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.25642693810197 2022-03-17 03:05:27,285.285 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022087277844548225 2022-03-17 03:05:27,285.285 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:05:27,286.286 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', 'looking', 'up', 'at', 'a', 'street', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:05:27,301.301 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'building', 'flower', 'wall', 'pole', 'tree', 'window', 'ground', 'head', '[UNK]', 'bush', 'hair', 'letter', 'plant', 'shirt', 'woman', 'post', 'face', 'handle', 'sky', 'hat', 'chair', 'pot', 'house', 'girl', 'toy', 'person', 'arm', 'statue', 'leg', 'vase', 'scarf', 'hand', 'shadow', 'shoe', 'sidewalk', 'light', 'stripe', 'mouth', 'umbrella', 'leaf', 'jacket', 'door', 'doll', 'top', 'decoration', 'short', 'man', 'fence', 'balloon'] 2022-03-17 03:05:43,213.213 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'building', 'street', 'woman', 'ground', 'hair', 'wall', 'arm', 'base', 'paper', 'plant', 'ball', 'letter', 'sign', 'jean', 'handle', 'dragon', 'coat', 'bush', 'pole', 'flower', 'jacket', 'bow', 'drum', 'glasses', 'pot', 'ribbon', 'shoe', 'decoration', 'container', 'sidewalk', 'stripe', 'vase', 'scarf'] 2022-03-17 03:08:07,434.434 2829:trainer.py:487 do_train_dict(): eta: 9:30:28 iter: 46600 speed: 278.8 images/sec total_norm: 147.1479 (149.2539) loss: 140.7175 (140.9012) masked_loss: 1.3701 (1.4033) tag_loss: 139.2201 (139.4979) time: 1.4328 (1.8367) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8316) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:08:07,794.794 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 03:08:07,795.795 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.6370849609375 2022-03-17 03:08:07,795.795 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.26492570145993 2022-03-17 03:08:31,569.569 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022109992802143097 2022-03-17 03:08:31,569.569 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:08:31,570.570 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'hot', 'dogs', '[MASK]', 'in', 'chili', 'and', 'sour', 'k', '##ra', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:08:31,585.585 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dog', 'hot', 'paper', 'table', 'bun', 'mustard', '[UNK]', 'napkin', 'onion', 'food', 'cheese', 'foil', 'tray', 'topping', 'light', 'corn', 'plate', 'chili', 'container', 'sauce', 'sandwich', 'line', 'handle', 'bag', 'drink', 'floor', 'top', 'olive', 'wall', 'next', 'red', 'fork', 'white', 'plastic', 'straw', 'pepper', 'end', 'tin', 'glass', 'tomato', 'bread', 'meat', 'chip', 'rice', 'bottom', 'candy', 'close', 'pizza', 'letter', 'can'] 2022-03-17 03:08:47,513.513 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'table', 'hot', 'dog', 'handle', 'cheese', 'towel', 'tray', 'lid', 'sauce', 'bean', 'sour', 'chili', 'napkin', 'onion', 'bun', 'mustard'] 2022-03-17 03:11:11,100.100 2829:trainer.py:487 do_train_dict(): eta: 9:27:40 iter: 46700 speed: 278.8 images/sec total_norm: 146.7118 (151.3979) loss: 141.5708 (141.8499) masked_loss: 1.4505 (1.4997) tag_loss: 139.6504 (140.3502) time: 1.4330 (1.8367) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4280 (1.8316) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:11:11,460.460 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 03:11:11,461.461 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.97488403320312 2022-03-17 03:11:11,461.461 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.2754523040902 2022-03-17 03:11:34,876.876 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022123076021671295 2022-03-17 03:11:34,876.876 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:11:34,876.876 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'old', 'truck', 'sitting', 'in', '[MASK]', '[MASK]', 'house', 'with', 'trees', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:11:34,892.892 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'truck', 'grill', 'windshield', '[UNK]', 'snow', 'plate', 'tire', 'bumper', 'ground', 'hood', 'license', 'sky', 'mirror', 'road', 'window', 'door', 'wheel', 'front', 'bush', 'building', 'snowy', 'light', 'logo', 'house', 'white', 'blue', 'shadow', 'roof', 'puddle', 'wood', 'car', 'handle', 'street', 'lot', 'number', 'forest', 'step', 'rim', 'fence', 'top', 'next', 'pine', 'old', 'parking', 'side', 'sign', 'trailer', 'pole', 'small'] 2022-03-17 03:11:50,809.809 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'road', 'front', 'ground', 'window', 'tree', 'sky', 'snow', 'truck', 'plate', 'shadow', 'mirror', 'license', 'hood', 'tire', 'grill', 'windshield', 'bumper'] 03-17 03:13:12.449 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 03:13:12.449 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 03:13:13.585 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 03:14:14,738.738 2829:trainer.py:487 do_train_dict(): eta: 9:24:51 iter: 46800 speed: 278.8 images/sec total_norm: 144.6523 (146.5960) loss: 140.8032 (142.8660) masked_loss: 1.4183 (1.4734) tag_loss: 139.4661 (141.3926) time: 1.4327 (1.8364) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8311) save_time: 8.8805 (16.0781) lr: 0.000030 max mem: 26307 2022-03-17 03:14:15,103.103 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 03:14:15,104.104 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 169.4796905517578 2022-03-17 03:14:15,104.104 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
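The monitor() entries above report one dict per GPU, with mem_used/mem_total in MiB and gpu_util in percent. Below is a minimal sketch of how such a list could be produced from nvidia-smi's query interface; the actual aml_server.py implementation is not shown in this log, so the helper is an assumption, not the real code.

```python
# Hypothetical reconstruction of the per-GPU stats list seen in monitor();
# aml_server.py's real implementation is not part of this log.
import subprocess

def gpu_stats():
    # With these flags nvidia-smi emits one "used, total, util" CSV row per GPU.
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    stats = []
    for row in out.strip().splitlines():
        used, total, util = (int(v.strip()) for v in row.split(","))
        stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return stats
```

On the 8x V100-SXM2 node in this log, the helper would return eight dicts, matching the list that monitor() writes roughly every 30 minutes.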
= 71.27002097091187 2022-03-17 03:14:38,810.810 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022122416645288467 2022-03-17 03:14:38,810.810 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:14:38,811.811 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'and', 'white', 'picture', 'of', 'chefs', 'cooking', 'in', 'a', 'kitchen', '.', 'sharpened', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:14:38,826.826 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'kitchen', 'head', '[UNK]', 'ceiling', 'bowl', 'light', 'table', 'wall', 'person', 'chef', 'food', 'watch', 'plate', 'shelf', 'hair', 'apron', 'pot', 'bottle', 'hand', 'ear', 'container', 'cup', 'pan', 'spoon', 'uniform', 'vent', 'handle', 'restaurant', 'tray', 'hat', 'woman', 'lid', 'door', 'group', 'arm', 'cutting', 'napkin', 'glove', 'floor', 'logo', 'hood', 'towel', 'dish', 'window', 'bag', 'photo', 'white', 'suit'] 2022-03-17 03:14:54,741.741 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'black', 'white', 'light', 'hair', 'person', 'table', 'wall', 'food', 'watch', 'shirt', 'kitchen', 'picture', 'ear', 'bowl', 'plate', 'bottle', 'ceiling', 'cap', 'sink', 'bread', 'chef', 'shelf', 'container', 'lid', 'refrigerator', 'apron', 'kettle'] 2022-03-17 03:17:18,255.255 2829:trainer.py:487 do_train_dict(): eta: 9:22:03 iter: 46900 speed: 279.0 images/sec total_norm: 148.8805 (151.5371) loss: 139.0197 (139.6487) masked_loss: 1.4521 (1.4627) tag_loss: 137.9297 (138.1860) time: 1.4324 (1.8352) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.8301) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:17:18,617.617 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 03:17:18,617.617 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.20040893554688 2022-03-17 03:17:18,617.617 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.28087819687863 2022-03-17 03:17:42,336.336 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022115055471658707 2022-03-17 03:17:42,337.337 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:17:42,337.337 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'players', '[MASK]', 'action', 'in', 'a', '[MASK]', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:17:42,352.352 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'helmet', 'grass', 'dirt', 'catcher', 'shoe', 'field', 'glove', 'shirt', 'fence', 'leg', 'bat', 'plate', 'mask', 'man', 'uniform', 'player', 'belt', 'ground', 'home', 'batter', 'jersey', 'umpire', 'head', 'hand', 'baseball', 'person', 'number', 'camera', 'line', 'face', 'game', 'arm', 'shin', 'hat', 'chair', 'sign', 'ball', 'banner', 'cooler', 'guard', 'stand', 'pad', 'railing', 'ready', 'towel', 'guards', 'woman', 'name', 'band'] 2022-03-17 03:17:58,232.232 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'home', 'hand', 'number', 'game', 'face', 'band', 'player', 'field', 'ground', 'arm', 'action', 'baseball', 'sign', 'shirt', 'jersey', 'leg', 'camera', 'plate', 'grass', 'belt', 'uniform', 'dirt', 'bat', 'mask', 'fence', 'banner', 'helmet', 'shoe', 'catcher', 'glove', 'umpire', 'batter'] 2022-03-17 03:20:22,027.027 2829:trainer.py:487 do_train_dict(): eta: 9:19:15 iter: 47000 speed: 278.6 images/sec total_norm: 148.0130 (150.1703) loss: 141.1111 (140.8762) masked_loss: 1.4546 (1.4666) tag_loss: 139.3108 (139.4095) time: 1.4342 (1.8377) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4287 (1.8325) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:20:22,389.389 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 03:20:22,389.389 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.59767150878906 2022-03-17 03:20:22,390.390 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.27458203066686 2022-03-17 03:20:46,083.083 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022139830514788628 2022-03-17 03:20:46,083.083 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:20:46,084.084 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'train', 'cars', '[MASK]', 'on', '[MASK]', 'tracks', 'next', 'to', 'a', 'platform', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:20:46,099.099 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'train', 'tree', 'platform', 'door', 'letter', 'number', 'roof', 'windshield', 'boat', 'handle', 'building', 'railing', 'track', 'bumper', 'cloud', 'rail', 'car', 'logo', 'vent', 'light', '[UNK]', 'line', 'pole', 'step', 'blue', 'front', 'bench', 'stair', 'seat', 'sign', 'person', 'mirror', 'bus', 'next', 'station', 'stripe', 'chair', 'gravel', 'top', 'fence', 'white', 'post', 'flag', 'wire', 'lot', 'background', 'man', 'side'] 2022-03-17 03:21:02,050.050 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'building', 'door', 'car', 'window', 'train', 'tree', 'letter', 'machine', 'sky', 'platform', 'roof', 'handle', 'cloud', 'railing', 'bumper'] 2022-03-17 03:23:25,859.859 2829:trainer.py:487 do_train_dict(): eta: 9:16:26 iter: 47100 speed: 278.5 images/sec total_norm: 148.5389 (149.7176) loss: 141.6600 (142.7763) masked_loss: 1.4648 (1.4956) tag_loss: 140.3538 (141.2807) time: 1.4338 (1.8383) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4286 (1.8331) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:23:26,221.221 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 03:23:26,221.221 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.4774169921875 2022-03-17 03:23:26,221.221 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.28039753639092 2022-03-17 03:23:50,048.048 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02217692881822586 2022-03-17 03:23:50,048.048 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:23:50,049.049 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'party', '[MASK]', 'four', 'standing', 'at', '[MASK]', 'tennis', 'net', 'one', 'man', '[MASK]', 'wearing', 'a', 'costume', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:23:50,064.064 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'shirt', 'hair', 'fence', 'tree', '[UNK]', 'tennis', 'court', 'pole', 'hand', 'shoe', 'ground', 'head', 'line', 'leg', 'short', 'sock', 'arm', 'grass', 'jean', 'ball', 'person', 'handle', 'beard', 'net', 'bench', 'bush', 'face', 'sky', 'game', 'wall', 'shadow', 'hat', 'street', 'sidewalk', 'cap', 'guy', 'car', 'sign', 'sunglasses', 'glasses', 'couple', 'young', 'road', 'backpack', 'logo', 'ear', 'watch', 'boy', 'roof'] 2022-03-17 03:24:06,080.080 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'party', 'woman', 'court', 'short', 'hair', 'person', 'standing', 'foot', 'tree', 'ball', 'jean', 'shirt', 'leg', 'tennis', 'coat', 'grass', 'belt', 'net', 'jacket', 'fence', 'collar', 'costume'] 2022-03-17 03:26:29,735.735 2829:trainer.py:487 do_train_dict(): eta: 9:13:38 iter: 47200 speed: 278.5 images/sec total_norm: 147.4293 (148.8538) loss: 141.5915 (142.9183) masked_loss: 1.4124 (1.4379) tag_loss: 140.2686 (141.4804) time: 1.4344 (1.8388) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.8336) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:26:30,096.096 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 03:26:30,097.097 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.59213256835938 2022-03-17 03:26:30,097.097 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.27672050266386 2022-03-17 03:26:53,820.820 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02217678911983967 2022-03-17 03:26:53,820.820 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:26:53,820.820 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'room', 'with', 'a', 'tv', 'and', 'no', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:26:53,836.836 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'wall', 'light', 'ceiling', 'room', 'box', 'door', '[UNK]', 'kitchen', 'cabinet', 'bag', 'chair', 'shelf', 'television', 'outlet', 'switch', 'table', 'living', 'refrigerator', 'carpet', 'couch', 'column', 'book', 'sofa', 'lid', 'shirt', 'fire', 'vent', 'suitcase', 'trash', 'towel', 'pillow', 'handle', 'pillar', 'window', 'microwave', 'hallway', 'man', 'picture', 'toy', 'leg', 'drawer', 'hood', 'pot', 'stool', 'flower', 'tile', 'person', 'can', 'cardboard'] 2022-03-17 03:27:09,757.757 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'book', 'door', 'light', 'television', 'ground', 'tv', 'floor', 'wall', 'plant', 'gun', 'box', 'kitchen', 'screen', 'bag', 'handle', 'ceiling', 'flower', 'hallway', 'wire', 'furniture', 'pot', 'closet', 'carpet', 'towel', 'shoe', 'shelf', 'cord', 'lid', 'mat', 'refrigerator', 'vase', 'rug'] 2022-03-17 03:29:33,763.763 2829:trainer.py:487 do_train_dict(): eta: 9:10:50 iter: 47300 speed: 278.2 images/sec total_norm: 146.5983 (148.3750) loss: 133.8196 (135.6978) masked_loss: 1.5059 (1.5127) tag_loss: 132.5114 (134.1851) time: 1.4333 (1.8402) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4282 (1.8350) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:29:34,124.124 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7878788113594055 2022-03-17 03:29:34,124.124 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.19818115234375 2022-03-17 03:29:34,124.124 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.2912524340022 2022-03-17 03:29:57,680.680 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022175565361976624 2022-03-17 03:29:57,680.680 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:29:57,681.681 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'guys', 'jumping', 'for', 'a', 'fr', '##is', '##bee', 'while', 'others', 'watch', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:29:57,696.696 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'short', 'grass', 'shoe', 'tree', 'sock', 'hand', '[UNK]', 'arm', 'sunglasses', 'ground', 'hat', 'fence', 'boy', 'belt', 'field', 'cap', 'head', 'leg', 'person', 'shadow', 'hair', 'group', 'air', 'park', 'cone', 'watch', 'glasses', 'number', 'car', 'vest', 'design', 'game', 'other', 'face', 'bag', 'stripe', 'young', 'woman', 'pole', 'knee', 'couple', 'sign', 'trunk', 'grassy', 'pad', 'back', 'sidewalk', 'glove'] 2022-03-17 03:30:13,627.627 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'air', 'short', 'field', 'ground', 'person', 'arm', 'boy', 'tree', 'shirt', 'leg', 'shadow', 'grass', 'belt', 'hat', 'fence', 'shoe', 'sunglasses', 'sock'] 2022-03-17 03:32:37,620.620 2829:trainer.py:487 do_train_dict(): eta: 9:08:01 iter: 47400 speed: 278.5 images/sec total_norm: 147.6856 (153.0433) loss: 138.0872 (138.9916) masked_loss: 1.4173 (1.4225) tag_loss: 136.7552 (137.5691) time: 1.4341 (1.8385) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4288 (1.8334) save_time: 8.8805 (16.0781) lr: 0.000029 max mem: 26307 2022-03-17 03:32:37,981.981 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 03:32:37,981.981 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.08389282226562 2022-03-17 03:32:37,981.981 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.29869244224147 2022-03-17 03:33:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022162389010190964 2022-03-17 03:33:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:33:01,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'baseball', 'batter', ',', 'catcher', ',', 'and', 'umpire', 'get', 'ready', 'for', 'the', 'pitch', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:33:01,941.941 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['uniform', 'belt', 'man', '[UNK]', 'shirt', 'player', 'jersey', 'head', 'bat', 'line', 'helmet', 'field', 'glove', 'grass', 'baseball', 'catcher', 'shoe', 'hat', 'hand', 'strap', 'leg', 'back', 'mask', 'arm', 'umpire', 'number', 'cap', 'batter', 'plate', 'dirt', 'logo', 'home', 'net', 'patch', 'name', 'hair', 'shin', 'game', 'guard', 'ball', 'pitch', 'shoulder', 'base', 'ground', 'stripe', 'sock', 'sleeve', 'pole', 'ready', 'guards'] 2022-03-17 03:33:17,887.887 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'get', 'head', 'man', 'name', 'hand', 'line', 'player', 'field', 'hair', 'arm', 'ready', 'baseball', 'shirt', 'jersey', 'leg', 'grass', 'belt', 'hat', 'cap', 'uniform', 'pitch', 'bat', 'mask', 'patch', 'helmet', 'shoe', 'catcher', 'glove', 'strap', 'bracelet', 'umpire', 'batter'] 2022-03-17 03:35:41,614.614 2829:trainer.py:487 do_train_dict(): eta: 9:05:13 iter: 47500 speed: 278.3 images/sec total_norm: 146.3037 (150.0780) loss: 139.3985 (141.8122) masked_loss: 1.4417 (1.4579) tag_loss: 137.9291 (140.3543) time: 1.4319 (1.8400) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4267 (1.8349) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:35:41,974.974 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 03:35:41,975.975 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.34457397460938 2022-03-17 03:35:41,975.975 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.30472671284394 2022-03-17 03:36:05,709.709 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02216068096458912 2022-03-17 03:36:05,710.710 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:36:05,710.710 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'is', 'swinging', 'a', 'tennis', 'rack', '##et', 'on', 'a', 'court', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:36:05,725.725 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', '[UNK]', 'hand', 'tennis', 'court', 'shoe', 'hair', 'arm', 'short', 'head', 'shirt', 'handle', 'player', 'woman', 'man', 'wall', 'nose', 'ground', 'ball', 'face', 'mouth', 'sock', 'ear', 'shadow', 'logo', 'ponytail', 'line', 'skirt', 'person', 'dress', 'stripe', 'letter', 'foot', 'chair', 'band', 'outfit', 'sign', 'string', 'top', 'eye', 'floor', 'wrist', 'banner', 'hat', 'watch', 'cap', 'necklace', 'girl', 'cooler', 'female'] 2022-03-17 03:36:21,612.612 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'player', 'woman', 'court', 'ground', 'hair', 'mouth', 'wall', 'arm', 'stand', 'chair', 'ball', 'letter', 'sky', 'shirt', 'platform', 'leg', 'dress', 'tennis', 'hat', 'cap', 'logo', 'shoe', 'outfit', 'sunglasses', 'bracelet', 'sock'] 2022-03-17 03:38:45,842.842 2829:trainer.py:487 do_train_dict(): eta: 9:02:24 iter: 47600 speed: 277.9 images/sec total_norm: 148.3192 (152.9970) loss: 138.2993 (140.6993) masked_loss: 1.5422 (1.5453) tag_loss: 136.1622 (139.1540) time: 1.4325 (1.8423) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8367) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:38:46,202.202 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4117647111415863 2022-03-17 03:38:46,202.202 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.47686767578125 2022-03-17 03:38:46,202.202 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.30535854283619 2022-03-17 03:39:10,270.270 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022198941558599472 2022-03-17 03:39:10,271.271 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:39:10,271.271 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'old', 'fire', 'hydra', '##nt', 'with', 'chip', '##ped', '[MASK]', 'has', 'rust', 'alpine', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:39:10,286.286 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'fire', 'cap', 'shirt', 'bolt', 'tree', 'man', 'top', 'wheel', 'ground', 'person', 'trunk', 'arm', 'face', 'paint', 'hat', 'hair', 'building', 'old', 'woman', 'fence', 'tag', 'branch', 'background', 'sky', 'dirt', 'hand', 'cart', 'wood', 'head', 'wall', 'knob', 'leaf', 'eye', 'pole', 'yellow', 'base', 'jacket', 'sidewalk', 'blue', 'plant', 'jean', 'side', 'grass', 'shoe', 'window', 'number', 'next', 'wooden', 'writing'] 2022-03-17 03:39:26,209.209 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'old', 'face', 'body', 'top', 'fire', 'ground', 'person', 'eye', 'tree', 'shirt', 'wheel', 'hat', 'cap', 'paint', 'bolt', 'knob'] 2022-03-17 03:41:49,875.875 2829:trainer.py:487 do_train_dict(): eta: 8:59:36 iter: 47700 speed: 278.2 images/sec total_norm: 145.2389 (147.5624) loss: 138.6443 (139.9719) masked_loss: 1.4503 (1.4740) tag_loss: 137.1254 (138.4979) time: 1.4330 (1.8404) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8352) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:41:50,237.237 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 03:41:50,237.237 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.892333984375 2022-03-17 03:41:50,237.237 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.31035914879962 2022-03-17 03:42:14,059.059 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02220848947763443 2022-03-17 03:42:14,059.059 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:42:14,060.060 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'cat', 'on', 'a', 'table', 'chewing', 'the', 'edge', 'of', 'a', 'book', '[MASK]', 'is', 'lying', 'beside', 'it', '[MASK]', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:42:14,075.075 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'head', 'ear', 'table', 'nose', 'book', 'face', 'eye', 'man', 'writing', 'shirt', '[UNK]', 'hair', 'paw', 'leg', 'floor', 'mouth', 'cord', 'kitten', 'letter', 'word', 'chair', 'carpet', 'magazine', 'orange', 'desk', 'cover', 'marker', 'hand', 'pen', 'girl', 'next', 'paper', 'photo', 'wooden', 'tail', 'wire', 'glasses', 'finger', 'line', 'image', 'dot', 'nail', 'white', 'handle', 'button', 'top', 'shadow', 'picture', 'straw'] 2022-03-17 03:42:29,945.945 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'face', 'book', 'mouth', 'floor', 'word', 'table', 'writing', 'eye', 'edge', 'letter', 'shirt', 'leg', 'nose', 'ear', 'desk', 'object', 'cat', 'clothing', 'carpet', 'dot'] 03-17 03:43:13.611 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 03:43:13.611 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 03:43:14.850 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 90}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}] 2022-03-17 03:44:54,011.011 2829:trainer.py:487 do_train_dict(): eta: 8:56:47 iter: 47800 speed: 278.1 images/sec total_norm: 146.9491 (151.2876) loss: 137.7076 (140.0496) masked_loss: 1.4431 (1.4877) tag_loss: 136.3362 (138.5619) time: 1.4329 (1.8413) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8362) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:44:54,372.372 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 03:44:54,372.372 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.2831573486328 2022-03-17 03:44:54,372.372 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.3158869494476 2022-03-17 03:45:18,415.415 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02220868691802025 2022-03-17 03:45:18,415.415 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:45:18,415.415 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', 'holding', 'a', 'broken', 'cell', 'phone', 'while', 'looking', 'at', 'the', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:45:18,431.431 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'phone', 'finger', 'person', 'cell', 'screen', '[UNK]', 'nail', 'thumb', 'button', 'face', 'table', 'logo', 'smart', 'woman', 'camera', 'picture', 'light', 'man', 'reflection', 'palm', 'device', 'eye', 'shirt', 'glasses', 'shadow', 'bowl', 'background', 'cord', 'wall', 'speaker', 'ring', 'lip', 'iphone', 'key', 'ear', 'head', 'hair', 'rim', 'glass', 'small', 'electronic', 'close', 'floor', 'cloth', 'screw', 'front', 'handle', 'room', 'base'] 2022-03-17 03:45:34,323.323 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['hand', 'person', 'phone', 'paper', 'cell', 'broken', 'screen', 'finger', 'camera', 'thumb'] 2022-03-17 03:47:58,202.202 2829:trainer.py:487 do_train_dict(): eta: 8:53:58 iter: 47900 speed: 278.0 images/sec total_norm: 147.0159 (151.2230) loss: 140.2948 (140.2809) masked_loss: 1.3591 (1.4616) tag_loss: 138.4697 (138.8192) time: 1.4328 (1.8419) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4277 (1.8367) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:47:58,562.562 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 03:47:58,562.562 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.79212951660156 2022-03-17 03:47:58,563.563 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.32025716304778 2022-03-17 03:48:22,818.818 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022234836593270302 2022-03-17 03:48:22,818.818 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:48:22,819.819 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'black', 'cat', 'sits', 'on', 'a', 'rug', 'with', 'a', 'red', 'cord', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:48:22,834.834 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'wall', 'rug', 'head', 'ear', 'cat', 'paw', 'tile', 'mat', 'door', 'dog', 'eye', 'nose', 'leg', 'bag', 'face', '[UNK]', 'tail', 'line', 'carpet', 'black', 'collar', 'shoe', 'leash', 'handle', 'room', 'tag', 'stripe', 'cabinet', 'kitchen', 'refrigerator', 'clothes', 'ground', 'foot', 'knob', 'cord', 'outlet', 'person', 'container', 'box', 'towel', 'strap', 'bed', 'next', 'can', 'bathroom', 'light', 'jean', 'small', 'chair'] 2022-03-17 03:48:38,854.854 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'room', 'black', 'door', 'red', 'floor', 'wall', 'eye', 'nose', 'ear', 'cat', 'tag', 'bow', 'tape', 'ribbon', 'cord', 'mat', 'tile', 'rug', 'paw', 'leash'] 2022-03-17 03:51:02,751.751 2829:trainer.py:487 do_train_dict(): eta: 8:51:10 iter: 48000 speed: 277.4 images/sec total_norm: 147.9644 (150.7946) loss: 141.1832 (139.7102) masked_loss: 1.4822 (1.4622) tag_loss: 139.7764 (138.2479) time: 1.4342 (1.8455) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.8402) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:51:03,112.112 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-17 03:51:03,112.112 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.7156219482422 2022-03-17 03:51:03,113.113 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.324152440886 2022-03-17 03:51:27,361.361 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022246699780225754 2022-03-17 03:51:27,362.362 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:51:27,362.362 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'boy', 'in', '[MASK]', '[MASK]', 'standing', 'in', 'the', 'snow', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:51:27,378.378 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', '[UNK]', 'ground', 'jacket', 'glove', 'hood', 'pole', 'ski', 'stripe', 'boot', 'boy', 'tree', 'hat', 'face', 'head', 'leg', 'child', 'mouth', 'sunglasses', 'hand', 'coat', 'nose', 'strap', 'shoe', 'young', 'foot', 'person', 'girl', 'tag', 'little', 'snowy', 'kid', 'ear', 'blue', 'small', 'arm', 'woman', 'cuff', 'stick', 'sock', 'bush', 'country', 'sky', 'vest', 'skier', 'track', 'branch', 'gear', 'skiing', 'slope'] 2022-03-17 03:51:43,396.396 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'young', 'ground', 'blue', 'mouth', 'boy', 'tree', 'leg', 'nose', 'snow', 'hat', 'tag', 'pole', 'jacket', 'hood', 'ski', 'boot', 'glove', 'strap', 'stripe'] 2022-03-17 03:54:06,778.778 2829:trainer.py:487 do_train_dict(): eta: 8:48:21 iter: 48100 speed: 278.2 images/sec total_norm: 147.5848 (149.1245) loss: 143.4307 (144.7334) masked_loss: 1.4259 (1.4422) tag_loss: 141.9932 (143.2913) time: 1.4318 (1.8403) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4266 (1.8351) save_time: 8.8805 (16.0781) lr: 0.000028 max mem: 26307 2022-03-17 03:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-17 03:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.91464233398438 2022-03-17 03:54:07,139.139 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.3353139077974 2022-03-17 03:54:31,475.475 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022274794057011604 2022-03-17 03:54:31,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:54:31,476.476 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'laptop', 'computer', 'set', 'up', 'on', 'a', 'makeshift', 'cardboard', '[MASK]', 'on', 'a', 'desk', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:54:31,491.491 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['laptop', 'table', 'keyboard', 'box', 'desk', 'screen', 'paper', 'computer', '[UNK]', 'pen', 'wall', 'book', 'key', 'button', 'cord', 'chair', 'speaker', 'mouse', 'pencil', 'wire', 'bag', 'cup', 'writing', 'window', 'handle', 'logo', 'stand', 'notebook', 'ear', 'shelf', 'pad', 'container', 'top', 'monitor', 'plug', 'phone', 'marker', 'envelope', 'head', 'tape', 'cable', 'hand', 'light', 'coffee', 'card', 'picture', 'lid', 'remote', 'door', 'arm'] 2022-03-17 03:54:47,468.468 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['cup', 'table', 'wall', 'key', 'stand', 'paper', 'computer', 'box', 'wood', 'screen', 'card', 'desk', 'map', 'handle', 'button', 'wire', 'monitor', 'keyboard', 'holder', 'envelope', 'cord', 'marker', 'laptop', 'pencil', 'makeshift', 'cardboard', 'scissors'] 2022-03-17 03:57:11,188.188 2829:trainer.py:487 do_train_dict(): eta: 8:45:32 iter: 48200 speed: 277.6 images/sec total_norm: 148.8483 (152.8994) loss: 143.9297 (144.0957) masked_loss: 1.5178 (1.5530) tag_loss: 141.9299 (142.5427) time: 1.4320 (1.8442) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8390) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 03:57:11,548.548 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5135135054588318 2022-03-17 03:57:11,549.549 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.86636352539062 2022-03-17 03:57:11,549.549 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
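The speed field is consistent with the batch size encoded in the output path at the iter-50000 checkpoint (batch-size_512) divided by the running-average step time: for the iter-48200 record, 512 / 1.8442 s ≈ 277.6 images/sec, exactly as logged. A one-line sanity check (the formula is inferred from this arithmetic, not taken from trainer.py):

    # Throughput check for the iter-48200 record.
    batch_size = 512          # from the checkpoint path: ..._batch-size_512_...
    avg_step_time = 1.8442    # running-average "time" at iter 48200
    print(f"{batch_size / avg_step_time:.1f} images/sec")  # -> 277.6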
= 71.34213658840267 2022-03-17 03:57:35,708.708 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022399011999368668 2022-03-17 03:57:35,709.709 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 03:57:35,709.709 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'is', 'an', 'elephant', 'statue', 'that', 'is', 'on', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 03:57:35,724.724 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'elephant', 'trunk', 'tree', 'foot', 'ear', 'head', 'shadow', 'eye', 'refrigerator', 'ground', 'door', 'fence', 'sign', '[UNK]', 'window', 'leg', 'truck', 'car', 'building', 'booth', 'park', 'sky', 'trailer', 'path', 'letter', 'tire', 'statue', 'person', 'road', 'sidewalk', 'hand', 'box', 'van', 'base', 'walkway', 'shirt', 'machine', 'boy', 'roof', 'shed', 'large', 'chain', 'man', 'gate', 'bush', 'front', 'house', 'top', 'street'] 2022-03-17 03:57:51,623.623 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'road', 'park', 'ground', 'eye', 'foot', 'window', 'tree', 'sky', 'path', 'leg', 'ear', 'truck', 'shadow', 'grass', 'trunk', 'fence', 'booth', 'trailer', 'elephant', 'refrigerator'] 2022-03-17 04:00:15,797.797 2829:trainer.py:487 do_train_dict(): eta: 8:42:43 iter: 48300 speed: 277.3 images/sec total_norm: 146.8591 (151.2959) loss: 141.2348 (142.8129) masked_loss: 1.4665 (1.4911) tag_loss: 139.5633 (141.3218) time: 1.4332 (1.8460) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.8410) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:00:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 04:00:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.21192932128906 2022-03-17 04:00:16,159.159 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.342319496407 2022-03-17 04:00:40,559.559 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02240808866918087 2022-03-17 04:00:40,560.560 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:00:40,560.560 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'a', 'skate', '##board', 'doing', 'a', 'trick', 'on', 'a', 'cement', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:00:40,576.576 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'shoe', 'hand', 'painting', 'shirt', 'man', 'leg', 'arm', 'jean', 'hair', 'head', 'wheel', 'map', 'tree', 'board', 'building', 'boy', 'hat', 'sidewalk', 'window', 'face', 'jacket', 'person', 'art', 'glasses', 'picture', 'ramp', 'sign', 'cap', 'branch', 'coat', 'bench', 'short', 'pole', 'ear', 'artwork', 'ground', 'door', 'block', 'watch', 'young', 'sweater', 'woman', 'paint', 'foot', 'step', 'trick', 'sock', 'light'] 2022-03-17 04:00:56,505.505 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'line', 'hair', 'wall', 'arm', 'boy', 'painting', 'leg', 'map', 'wheel', 'hat', 'cap', 'jacket', 'trick', 'reflection', 'sleeve', 'shoe', 'cement'] 2022-03-17 04:03:20,532.532 2829:trainer.py:487 do_train_dict(): eta: 8:39:54 iter: 48400 speed: 277.2 images/sec total_norm: 147.0013 (149.5911) loss: 141.6716 (143.8367) masked_loss: 1.4934 (1.5380) tag_loss: 140.4748 (142.2988) time: 1.4344 (1.8473) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.8421) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:03:20,893.893 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 04:03:20,893.893 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.52845001220703 2022-03-17 04:03:20,894.894 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
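The eta field shrinks by roughly 2:49 per 100 iterations at ~1.85 s/iter, i.e. it behaves like the average step time multiplied by the remaining iteration count. A sketch of that computation under the usual formulation; the run's true max_iter is not visible in this excerpt, so the value below is only a placeholder:

    import datetime

    def format_eta(avg_step_time, cur_iter, max_iter):
        # H:MM:SS string in the same shape as the "eta:" field (assumed formula).
        eta_seconds = avg_step_time * (max_iter - cur_iter)
        return str(datetime.timedelta(seconds=int(eta_seconds)))

    # max_iter=65300 is a placeholder chosen to land near the logged
    # iter-48400 value of "eta: 8:39:54".
    print(format_eta(1.8473, 48400, 65300))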
= 71.35437128912542 2022-03-17 04:03:45,201.201 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022411055862903595 2022-03-17 04:03:45,202.202 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:03:45,203.203 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'dunes', 'click', 'tower', 'on', 'the', 'front', '[MASK]', 'the', 'building', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:03:45,218.218 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'sky', 'tree', 'window', 'clock', 'tower', '[UNK]', 'wall', 'church', 'dome', 'roof', 'hand', 'top', 'sign', 'spire', 'cross', 'cloud', 'arch', 'tall', 'large', 'light', 'street', 'view', 'pole', 'weather', 'statue', 'yellow', 'background', 'vane', 'old', 'front', 'white', 'wire', 'fence', 'person', 'middle', 'door', 'flag', 'archway', 'circle', 'image', 'city', 'flower', 'leaf', 'ornate', 'side', 'lamp', 'banner', 'bell', 'line'] 2022-03-17 04:04:01,204.204 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'building', 'front', 'wall', 'window', 'tree', 'tower', 'sky', 'clock', 'cloud', 'arch', 'dome', 'click'] 2022-03-17 04:06:25,064.064 2829:trainer.py:487 do_train_dict(): eta: 8:37:05 iter: 48500 speed: 277.5 images/sec total_norm: 148.7495 (151.7752) loss: 138.5834 (139.9065) masked_loss: 1.4741 (1.4953) tag_loss: 136.8444 (138.4111) time: 1.4329 (1.8453) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8401) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:06:25,426.426 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 04:06:25,426.426 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.03562927246094 2022-03-17 04:06:25,426.426 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.35484857912417 2022-03-17 04:06:49,925.925 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022365126758813858 2022-03-17 04:06:49,926.926 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:06:49,926.926 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'elephants', 'in', 'tall', 'dry', 'grass', 'next', 'to', '[MASK]', 'pistols', 'of', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:06:49,942.942 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'elephant', 'water', 'tree', 'trunk', 'herd', 'field', 'river', 'animal', '[UNK]', 'ear', 'sky', 'bush', 'tail', 'rock', 'head', 'bank', 'plant', 'group', 'body', 'large', 'bird', 'palm', 'green', 'grassy', 'leg', 'horn', 'leaf', 'zebra', 'branch', 'ripple', 'wild', 'wood', 'cow', 'plain', 'shore', 'dirt', 'watering', 'tall', 'cloud', 'buffalo', 'horse', 'stream', 'hole', 'land', 'lake', 'couple', 'reflection', 'hill', 'area'] 2022-03-17 04:07:05,864.864 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'body', 'field', 'tree', 'sky', 'tall', 'dry', 'bird', 'palm', 'grass', 'bush', 'plain', 'trunk', 'elephant', 'herd'] 2022-03-17 04:09:29,724.724 2829:trainer.py:487 do_train_dict(): eta: 8:34:16 iter: 48600 speed: 277.3 images/sec total_norm: 148.2089 (150.7668) loss: 140.1087 (140.9313) masked_loss: 1.4378 (1.4482) tag_loss: 138.0247 (139.4831) time: 1.4348 (1.8467) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4296 (1.8415) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:09:30,088.088 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6216216087341309 2022-03-17 04:09:30,088.088 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.89089965820312 2022-03-17 04:09:30,088.088 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.35116856788463 2022-03-17 04:09:54,593.593 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022373618558049202 2022-03-17 04:09:54,594.594 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:09:54,594.594 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', '[MASK]', '279', 'sitting', 'by', 'a', 'fa', '##uce', '##t', 'of', '[MASK]', 'water', 'in', 'a', 'tub', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:09:54,609.609 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'head', 'ear', 'eye', 'wall', '[UNK]', 'bathroom', 'toilet', 'sink', 'floor', 'black', 'tile', 'nose', 'lid', 'tub', 'water', 'handle', 'drain', 'leg', 'face', 'shower', 'mirror', 'knob', 'paw', 'camera', 'rug', 'bottle', 'animal', 'holder', 'white', 'door', 'tank', 'collar', 'towel', 'paper', 'soap', 'tag', 'cabinet', 'ledge', 'can', 'pipe', 'reflection', 'shadow', 'next', 'mat', 'brush', 'curtain', 'bar', 'bowl', 'person'] 2022-03-17 04:10:10,551.551 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'black', 'wall', 'eye', 'ear', 'cat', 'handle', 'bathroom', 'sink', 'drain', 'tub'] 2022-03-17 04:12:34,264.264 2829:trainer.py:487 do_train_dict(): eta: 8:31:27 iter: 48700 speed: 277.4 images/sec total_norm: 147.6346 (150.0370) loss: 139.4770 (141.0584) masked_loss: 1.3989 (1.4448) tag_loss: 138.1284 (139.6136) time: 1.4320 (1.8454) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8399) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:12:34,625.625 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 04:12:34,626.626 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.9613037109375 2022-03-17 04:12:34,626.626 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.3578565394292 2022-03-17 04:12:59,166.166 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022435788065195084 2022-03-17 04:12:59,166.166 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:12:59,167.167 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##raf', '##fe', 'in', 'a', '[MASK]', 'with', 'a', 'person', 'near', 'trees', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:12:59,182.182 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'head', 'bush', 'ear', '[UNK]', 'wood', 'forest', 'grass', 'eye', 'neck', 'branch', 'trunk', 'horn', 'nose', 'face', 'ground', 'hair', 'spot', 'man', 'tail', 'leg', 'mane', 'field', 'shirt', 'camera', 'dirt', 'animal', 'mouth', 'brush', 'leaf', 'sky', 'area', 'plant', 'next', 'green', 'group', 'hat', 'log', 'couple', 'hand', 'lush', 'jungle', 'small', 'wooded', 'tall', 'path', 'standing', 'person', 'fence', 'front'] 03-17 04:13:14.951 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 04:13:14.951 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 2022-03-17 04:13:15,081.081 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'field', 'ground', 'hair', 'person', 'arm', 'forest', 'eye', 'neck', 'tree', 'wood', 'shirt', 'leg', 'ear', 'grass', 'belt', 'bush', 'horn'] 03-17 04:13:15.772 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 4}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 13}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 12}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 37}] 2022-03-17 04:15:38,819.819 2829:trainer.py:487 do_train_dict(): eta: 8:28:38 iter: 48800 speed: 277.4 images/sec total_norm: 145.3937 (151.0171) loss: 137.4663 (139.8789) masked_loss: 1.3957 (1.4348) tag_loss: 136.2184 (138.4440) time: 1.4329 (1.8455) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8403) save_time: 8.8805 (16.0781) lr: 0.000027 max mem: 26307 2022-03-17 04:15:39,180.180 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 04:15:39,180.180 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 112.019775390625 2022-03-17 04:15:39,180.180 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
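The interleaved aml_server.py monitor() records report per-GPU memory and utilization dicts (here ~29,000 of 32,510 MiB used, with utilization momentarily low between steps). One plausible way to build such records from nvidia-smi's CSV query mode; the real aml_server.py implementation is not shown in this log, and only the nvidia-smi flags below are standard:

    import subprocess

    def gpu_monitor():
        # Query per-GPU memory and utilization in machine-readable form.
        out = subprocess.check_output([
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total,utilization.gpu",
            "--format=csv,noheader,nounits",
        ]).decode()
        stats = []
        for line in out.strip().splitlines():
            used, total, util = (int(x) for x in line.split(", "))
            stats.append({"mem_used": used, "mem_total": total, "gpu_util": util})
        return stats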
= 71.36997084822391 2022-03-17 04:16:03,580.580 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022514278069138527 2022-03-17 04:16:03,581.581 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:16:03,581.581 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tall', 'os', '##tric', '##h', '[MASK]', '[MASK]', 'a', 'lush', 'green', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:16:03,596.596 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', '[UNK]', 'neck', 'bush', 'bird', 'trunk', 'branch', 'field', 'head', 'leg', 'feather', 'flower', 'tail', 'black', 'ground', 'wing', 'plant', 'forest', 'large', 'sky', 'beak', 'grassy', 'wild', 'log', 'background', 'area', 'tall', 'rock', 'body', 'group', 'wood', 'standing', 'pine', 'next', 'top', 'green', 'palm', 'white', 'couple', 'wooded', 'dirt', 'hill', 'other', 'walking', 'brush', 'bear', 'front', 'date', 'middle'] 2022-03-17 04:16:19,487.487 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'field', 'green', 'neck', 'tree', 'branch', 'leg', 'tall', 'bird', 'grass', 'tail', 'bush', 'flower', 'trunk', 'feather', 'lush'] 2022-03-17 04:18:43,361.361 2829:trainer.py:487 do_train_dict(): eta: 8:25:49 iter: 48900 speed: 277.4 images/sec total_norm: 150.1504 (153.1115) loss: 139.2478 (140.6943) masked_loss: 1.3477 (1.3736) tag_loss: 138.0876 (139.3207) time: 1.4322 (1.8454) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8402) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:18:43,723.723 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 04:18:43,723.723 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.8594970703125 2022-03-17 04:18:43,723.723 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.37973635148029 2022-03-17 04:19:08,361.361 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022561753168702126 2022-03-17 04:19:08,361.361 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:19:08,361.361 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'and', 'woman', 'walking', 'down', 'a', '[MASK]', '[MASK]', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:19:08,377.377 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jacket', '[UNK]', 'umbrella', 'sidewalk', 'person', 'ground', 'shoe', 'head', 'man', 'rain', 'reflection', 'hood', 'coat', 'leg', 'hand', 'vest', 'arm', 'hair', 'bag', 'line', 'face', 'building', 'woman', 'sign', 'nose', 'light', 'street', 'purse', 'handle', 'boy', 'jean', 'wall', 'scarf', 'suitcase', 'tree', 'pole', 'boot', 'car', 'logo', 'hat', 'stripe', 'strap', 'road', 'zipper', 'wheel', 'leaf', 'mouth', 'window', 'floor', 'child'] 2022-03-17 04:19:24,433.433 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'street', 'woman', 'ground', 'hair', 'person', 'jean', 'shirt', 'bag', 'rain', 'handle', 'jacket', 'leaf', 'purse', 'reflection', 'shoe', 'sidewalk', 'umbrella', 'soaked', 'vest', 'stripe', 'scarf'] 2022-03-17 04:21:47,816.816 2829:trainer.py:487 do_train_dict(): eta: 8:23:00 iter: 49000 speed: 277.6 images/sec total_norm: 149.1608 (151.3031) loss: 144.4656 (144.3309) masked_loss: 1.4613 (1.4955) tag_loss: 143.1575 (142.8353) time: 1.4313 (1.8445) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4258 (1.8393) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:21:48,179.179 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 04:21:48,179.179 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.80441284179688 2022-03-17 04:21:48,179.179 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.38384796111501 2022-03-17 04:22:12,755.755 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02256636507809162 2022-03-17 04:22:12,755.755 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:22:12,756.756 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'road', 'crew', 'of', 'a', 'large', 'city', 'street', 'with', '[MASK]', '[MASK]', 'above', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:22:12,771.771 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'tree', 'vest', 'building', '[UNK]', 'line', 'jacket', 'sky', 'shoe', 'window', 'person', 'curb', 'ground', 'road', 'sidewalk', 'street', 'tire', 'fire', 'truck', 'head', 'sign', 'cart', 'car', 'worker', 'shirt', 'jean', 'hat', 'coat', 'safety', 'dirt', 'door', 'plant', 'house', 'cone', 'light', 'hose', 'pole', 'fence', 'city', 'hair', 'leaf', 'wall', 'wheel', 'construction', 'cap', 'hand', 'yellow', 'vehicle', 'stripe', 'ladder'] 2022-03-17 04:22:28,751.751 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'line', 'building', 'large', 'road', 'street', 'car', 'ground', 'person', 'sun', 'window', 'tree', 'crew', 'sign', 'sky', 'jean', 'truck', 'jacket', 'toy', 'shoe', 'tire', 'cone', 'curb', 'suv', 'vest'] 2022-03-17 04:24:52,660.660 2829:trainer.py:487 do_train_dict(): eta: 8:20:10 iter: 49100 speed: 277.0 images/sec total_norm: 147.2410 (149.9541) loss: 139.6843 (139.8121) masked_loss: 1.4377 (1.4726) tag_loss: 137.8044 (138.3395) time: 1.4323 (1.8485) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8433) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:24:53,020.020 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 04:24:53,020.020 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.60842895507812 2022-03-17 04:24:53,020.020 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.39298798010601 2022-03-17 04:25:17,735.735 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02259107679128647 2022-03-17 04:25:17,735.735 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:25:17,735.735 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', 'riding', 'on', 'the', '[MASK]', 'of', 'an', 'elephant', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:25:17,751.751 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'window', 'elephant', 'leg', 'trunk', 'shirt', 'hair', 'ear', 'head', 'eye', 'short', 'sign', 'tree', 'sock', '[UNK]', 'foot', 'man', 'plant', 'door', 'wall', 'palm', 'shoe', 'person', 'woman', 'ground', 'fence', 'doorway', 'leaf', 'mouth', 'girl', 'archway', 'statue', 'jean', 'animal', 'hand', 'chair', 'arm', 'boot', 'tail', 'bush', 'sidewalk', 'belt', 'face', 'logo', 'poster', 'branch', 'pole', 'arch', 'bag', 'railing'] 2022-03-17 04:25:33,692.692 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'building', 'door', 'woman', 'short', 'hair', 'girl', 'mouth', 'table', 'eye', 'chair', 'plant', 'foot', 'window', 'tree', 'sign', 'shirt', 'platform', 'leg', 'ear', 'palm', 'tail', 'statue', 'logo', 'trunk', 'boot', 'elephant', 'shoe', 'poster', 'advertisement', 'patio', 'archway', 'sock'] 2022-03-17 04:27:57,500.500 2829:trainer.py:487 do_train_dict(): eta: 8:17:21 iter: 49200 speed: 277.0 images/sec total_norm: 146.8042 (150.1229) loss: 137.9583 (140.2049) masked_loss: 1.3763 (1.4622) tag_loss: 136.1676 (138.7428) time: 1.4323 (1.8484) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4272 (1.8433) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:27:57,861.861 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 04:27:57,861.861 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.09893798828125 2022-03-17 04:27:57,862.862 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
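Tag Precision (~71.4) climbs very slowly across this excerpt while Tag mAP hovers near 0.0226; both track multi-label tag prediction over the tag vocabulary. A generic macro-mAP computation for such a setup, assuming scikit-learn and toy indicator/score matrices; the pipeline's own evaluation code is not in this log:

    import numpy as np
    from sklearn.metrics import average_precision_score

    # Rows = samples, columns = tag vocabulary entries (illustrative only).
    y_true = np.array([[1, 0, 1, 1],        # gold tag indicators
                       [0, 1, 1, 0]])
    y_score = np.array([[0.9, 0.2, 0.7, 0.6],  # per-tag scores/logits
                        [0.3, 0.8, 0.6, 0.2]])
    mAP = average_precision_score(y_true, y_score, average="macro")
    print(mAP)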
= 71.39954369256753 2022-03-17 04:28:22,737.737 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022620538249611855 2022-03-17 04:28:22,737.737 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:28:22,737.737 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'nicely', 'dressed', 'man', 'standing', 'by', '[MASK]', 'door', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:28:22,753.753 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'jacket', 'shirt', '[UNK]', 'tie', 'shoe', 'suit', 'wall', 'door', 'belt', 'man', 'face', 'hair', 'tag', 'leg', 'collar', 'room', 'arm', 'neck', 'head', 'name', 'hand', 'clothes', 'mouth', 'foot', 'nose', 'glasses', 'knot', 'ear', 'coat', 'eye', 'window', 'ceiling', 'book', 'beard', 'buckle', 'wheel', 'bag', 'outlet', 'table', 'cord', 'picture', 'sock', 'chair', 'paper', 'frame', 'boot', 'hat', 'curtain', 'front'] 2022-03-17 04:28:38,638.638 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'room', 'door', 'hair', 'floor', 'wall', 'arm', 'eye', 'paper', 'neck', 'foot', 'window', 'sign', 'shirt', 'leg', 'clothes', 'nose', 'ear', 'suit', 'pocket', 'tie', 'belt', 'blind', 'tag', 'jacket', 'collar', 'boot', 'beard', 'shoe', 'poster', 'knob'] 2022-03-17 04:31:02,206.206 2829:trainer.py:487 do_train_dict(): eta: 8:14:32 iter: 49300 speed: 277.2 images/sec total_norm: 147.5067 (149.0720) loss: 137.8561 (137.0620) masked_loss: 1.3613 (1.3977) tag_loss: 135.8996 (135.6643) time: 1.4309 (1.8471) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4256 (1.8419) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:31:02,566.566 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7647058963775635 2022-03-17 04:31:02,567.567 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 168.91876220703125 2022-03-17 04:31:02,567.567 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
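The caption acc values are consistent with simple ratios over supervised token positions (e.g. the 0.7647... at iter 49300 is exactly 13/17), i.e. the fraction of masked caption tokens predicted correctly. A minimal reconstruction, assuming PyTorch logits/labels where unsupervised positions carry an ignore index; the index value and function name are assumptions, as the pipeline's own definition is not visible here:

    import torch

    def masked_token_accuracy(logits, labels, ignore_index=-1):
        # Fraction of supervised (masked) positions predicted correctly.
        mask = labels != ignore_index
        pred = logits.argmax(dim=-1)
        return (pred[mask] == labels[mask]).float().mean().item()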
= 71.3919389161021 2022-03-17 04:31:27,455.455 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02261926420032978 2022-03-17 04:31:27,456.456 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:31:27,456.456 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'pair', '##হ', '[MASK]', 'playing', 'in', 'pool', 'at', 'zoo', 'environment', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:31:27,471.471 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'water', 'ear', 'nose', 'head', 'eye', 'face', 'fur', 'mouth', 'leg', 'snout', 'brown', 'large', 'rock', 'back', 'tree', 'tongue', '[UNK]', 'hair', 'animal', 'grass', 'sky', 'foot', 'black', 'reflection', 'neck', 'dog', 'paw', 'river', 'wave', 'ripple', 'pool', 'teeth', 'furry', 'big', 'log', 'snow', 'background', 'mountain', 'pole', 'wall', 'light', 'polar', 'swimming', 'horn', 'plant', 'small', 'fish', 'long', 'next'] 2022-03-17 04:31:43,374.374 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'mouth', 'chest', 'eye', 'neck', 'dog', 'environment', 'teeth', 'animal', 'leg', 'tongue', 'nose', 'ear', 'bear', 'pool', 'zoo'] 2022-03-17 04:34:07,220.220 2829:trainer.py:487 do_train_dict(): eta: 8:11:42 iter: 49400 speed: 276.7 images/sec total_norm: 150.4810 (151.2351) loss: 141.0749 (140.9587) masked_loss: 1.3648 (1.3966) tag_loss: 139.6023 (139.5621) time: 1.4334 (1.8501) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4283 (1.8450) save_time: 8.8805 (16.0781) lr: 0.000026 max mem: 26307 2022-03-17 04:34:07,583.583 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 04:34:07,583.583 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.29539489746094 2022-03-17 04:34:07,583.583 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.38903579711913 2022-03-17 04:34:32,542.542 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02261706255376339 2022-03-17 04:34:32,543.543 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:34:32,543.543 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'golden', 'retrieve', '##r', 'and', 'a', 'pit', 'bull', 'sitting', '[MASK]', 'the', 'back', 'of', '[unused465]', 'pickup', 'truck', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:34:32,558.558 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'tree', 'light', 'dog', 'building', 'sky', 'ear', 'street', 'sign', 'mirror', '[UNK]', 'pole', 'collar', 'nose', 'road', 'eye', 'sidewalk', 'background', 'window', 'head', 'leg', 'tire', 'tail', 'motorcycle', 'windshield', 'traffic', 'seat', 'truck', 'license', 'plate', 'line', 'paw', 'curb', 'wheel', 'grill', 'bike', 'vehicle', 'back', 'city', 'suv', 'trunk', 'ground', 'wall', 'door', 'bumper', 'man', 'person', 'harness', 'next', 'handle'] 2022-03-17 04:34:48,426.426 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'back', 'head', 'line', 'building', 'door', 'road', 'street', 'light', 'car', 'fire', 'post', 'window', 'tree', 'golden', 'sign', 'sky', 'dog', 'traffic', 'nose', 'ear', 'truck', 'plate', 'mirror', 'license', 'pole', 'flower', 'pit', 'bull', 'trunk', 'collar', 'lamp', 'trash', 'sidewalk', 'pickup', 'suv', 'grill', 'windshield'] 2022-03-17 04:37:12,111.111 2829:trainer.py:487 do_train_dict(): eta: 8:08:53 iter: 49500 speed: 276.9 images/sec total_norm: 147.0919 (148.2388) loss: 143.1318 (143.2327) masked_loss: 1.4506 (1.5007) tag_loss: 141.5413 (141.7319) time: 1.4313 (1.8489) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8438) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:37:12,475.475 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7297297120094299 2022-03-17 04:37:12,475.475 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.5308837890625 2022-03-17 04:37:12,475.475 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.39715541562727 2022-03-17 04:37:37,496.496 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02263442426919937 2022-03-17 04:37:37,497.497 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:37:37,497.497 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', '[MASK]', 'casts', 'a', 'shadow', 'on', 'four', 'chairs', 'on', 'a', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:37:37,514.514 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['beach', 'chair', 'sand', 'umbrella', 'water', 'person', 'sky', 'ocean', 'wave', 'shadow', 'pole', 'table', 'cloud', 'lounge', 'palm', '[UNK]', 'mountain', 'shore', 'boat', 'towel', 'cushion', 'sandy', 'tree', 'rock', 'lawn', 'horizon', 'bird', 'dog', 'bench', 'footprint', 'roof', 'patio', 'top', 'man', 'hill', 'hut', 'view', 'resort', 'couple', 'many', 'sun', 'building', 'background', 'large', 'canopy', 'straw', 'sunny', 'empty', 'ground', 'wall'] 2022-03-17 04:37:53,433.433 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'person', 'table', 'chair', 'beach', 'sky', 'ocean', 'wave', 'bird', 'shadow', 'sand', 'cloud', 'pole', 'lounge', 'umbrella'] 2022-03-17 04:40:17,174.174 2829:trainer.py:487 do_train_dict(): eta: 8:06:03 iter: 49600 speed: 276.7 images/sec total_norm: 149.4679 (150.0831) loss: 138.9617 (140.1336) masked_loss: 1.3869 (1.4121) tag_loss: 137.5309 (138.7215) time: 1.4320 (1.8506) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4267 (1.8455) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:40:17,530.530 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6756756901741028 2022-03-17 04:40:17,531.531 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.37367248535156 2022-03-17 04:40:17,531.531 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.39898772978447 2022-03-17 04:40:42,473.473 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022680258378386497 2022-03-17 04:40:42,473.473 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:40:42,474.474 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'cooked', 'dish', 'of', 'some', 'kind', 'sitting', 'on', 'a', '[MASK]', 'with', 'two', 'bowls', 'of', 'fresh', 'vegetables', 'next', 'to', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:40:42,489.489 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'table', 'carrot', '[UNK]', 'food', 'plate', 'tomato', 'salad', 'bowls', 'design', 'dish', 'pepper', 'stem', 'vegetable', 'strawberry', 'spoon', 'orange', 'fork', 'meat', 'rice', 'fruit', 'flower', 'container', 'onion', 'cheese', 'chicken', 'different', 'bean', 'handle', 'glass', 'potato', 'other', 'lemon', 'cloth', 'white', 'wooden', 'bread', 'mushroom', 'leaf', 'shrimp', 'slice', 'pasta', 'full', 'next', 'napkin', 'sauce', 'meal', 'corn', 'olive', 'egg'] 2022-03-17 04:40:58,496.496 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'kind', 'table', 'food', 'bowl', 'fresh', 'stem', 'dish', 'lid', 'cooked', 'carrot'] 03-17 04:43:15.869 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 04:43:15.869 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 04:43:16.916 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 04:43:22,100.100 2829:trainer.py:487 do_train_dict(): eta: 8:03:14 iter: 49700 speed: 276.9 images/sec total_norm: 148.5189 (151.0028) loss: 140.0317 (140.8793) masked_loss: 1.4249 (1.4468) tag_loss: 138.8981 (139.4326) time: 1.4312 (1.8493) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4263 (1.8442) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:43:22,462.462 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 04:43:22,462.462 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 103.23800659179688 2022-03-17 04:43:22,462.462 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.41179775712959 2022-03-17 04:43:47,255.255 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022702042013406754 2022-03-17 04:43:47,255.255 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:43:47,256.256 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', '[MASK]', '[MASK]', 'picture', 'if', 'elephant', 'swimming', 'in', 'the', 'lake', 'waters', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:43:47,271.271 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['elephant', 'ear', 'trunk', 'leg', 'tree', 'water', 'grass', 'head', '[UNK]', 'forest', 'mouth', 'eye', 'bush', 'body', 'branch', 'tail', 'wood', 'foot', 'walking', 'river', 'large', 'plant', 'leaf', 'standing', 'face', 'next', 'back', 'animal', 'shore', 'ripple', 'hair', 'flower', 'area', 'tongue', 'field', 'bird', 'young', 'big', 'couple', 'shirt', 'reflection', 'hand', 'splash', 'tall', 'wild', 'rock', 'fish', 'bank', 'background', 'short'] 2022-03-17 04:44:03,223.223 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'water', 'mouth', 'lake', 'forest', 'eye', 'tree', 'picture', 'leg', 'ear', 'grass', 'tail', 'swimming', 'bush', 'trunk', 'elephant'] 2022-03-17 04:46:27,024.024 2829:trainer.py:487 do_train_dict(): eta: 8:00:24 iter: 49800 speed: 276.9 images/sec total_norm: 149.1315 (154.4250) loss: 139.0368 (140.5827) masked_loss: 1.4775 (1.4722) tag_loss: 137.3105 (139.1106) time: 1.4321 (1.8489) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8437) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:46:27,385.385 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 04:46:27,385.385 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.45236206054688 2022-03-17 04:46:27,385.385 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.41740675106315 2022-03-17 04:46:52,360.360 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02271447703242302 2022-03-17 04:46:52,360.360 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:46:52,360.360 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'is', 'holding', '[MASK]', 'hand', 'up', 'by', 'the', 'stop', 'sign', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:46:52,376.376 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'sky', 'letter', 'pole', 'stop', 'street', 'bolt', 'screw', 'tree', '[UNK]', 'wire', 'line', 'bracket', 'arrow', 'post', 'word', 'band', 'red', 'branch', 'top', 'number', 'green', 'intersection', 'antenna', 'front', 'blue', 'writing', 'string', 'strap', 'road', 'close', 'building', 'white', 'background', 'logo', 'light', 'border', 'shadow', 'head', 'cloud', 'stripe', 'back', 'next', 'ring', 'design', 'face', 'wall', 'language', 'metal', 'leaf'] 2022-03-17 04:47:08,344.344 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'stop', 'ring', 'letter', 'sign', 'sky', 'finger', 'shadow', 'palm', 'pole', 'thumb', 'arrow', 'bolt', 'screw'] 2022-03-17 04:49:32,159.159 2829:trainer.py:487 do_train_dict(): eta: 7:57:35 iter: 49900 speed: 276.6 images/sec total_norm: 150.7205 (151.1491) loss: 140.5784 (142.9568) masked_loss: 1.4407 (1.4690) tag_loss: 138.7918 (141.4877) time: 1.4319 (1.8517) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8462) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:49:32,524.524 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-17 04:49:32,524.524 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 101.52361297607422 2022-03-17 04:49:32,525.525 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.42669495391846 2022-03-17 04:49:57,553.553 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022757606580853462 2022-03-17 04:49:57,553.553 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:49:57,553.553 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'long', '##tidae', 'train', 'pulling', 'into', '[MASK]', 'station', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:49:57,569.569 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'track', 'window', 'sky', 'roof', 'front', 'station', '[UNK]', 'pole', 'platform', 'car', 'windshield', 'mountain', 'light', 'engine', 'building', 'sign', 'stripe', 'number', 'person', 'box', 'hill', 'door', 'tree', 'line', 'red', 'bumper', 'man', 'beam', 'wall', 'post', 'wire', 'grass', 'ground', 'sidewalk', 'passenger', 'pillar', 'bush', 'long', 'large', 'next', 'letter', 'shelter', 'gravel', 'bench', 'tower', 'traffic', 'fence', 'logo', 'white'] 2022-03-17 04:50:13,525.525 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'long', 'number', 'line', 'station', 'building', 'front', 'red', 'light', 'car', 'track', 'person', 'wall', 'hill', 'mountain', 'engine', 'window', 'train', 'sign', 'sky', 'platform', 'roof', 'pole', 'rack', 'stripe'] 2022-03-17 04:52:37,494.494 2829:trainer.py:487 do_train_dict(): eta: 7:54:45 iter: 50000 speed: 276.3 images/sec total_norm: 149.1054 (150.0296) loss: 138.9571 (140.2438) masked_loss: 1.3931 (1.4389) tag_loss: 137.3763 (138.8049) time: 1.4329 (1.8534) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8482) save_time: 8.8805 (16.0781) lr: 0.000025 max mem: 26307 2022-03-17 04:52:37,496.496 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0050000.pt 2022-03-17 04:52:46,593.593 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-17 04:52:46,593.593 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.10516357421875 2022-03-17 04:52:46,593.593 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
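At iter 50000 the trainer writes a snapshot (model_iter_0050000.pt, the iteration zero-padded to seven digits) under the run's output/.../snapshot/ directory, and the following record shows save_time folding into the running averages. A sketch of periodic checkpointing in the same naming scheme, assuming PyTorch; checkpoint.py's actual contents are not shown in this log:

    import os
    import torch

    def save_snapshot(model, optimizer, it, out_dir):
        # Write a snapshot named like the log's model_iter_0050000.pt.
        snap_dir = os.path.join(out_dir, "snapshot")
        os.makedirs(snap_dir, exist_ok=True)
        path = os.path.join(snap_dir, "model_iter_{:07d}.pt".format(it))
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "iteration": it}, path)
        return path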
= 71.43397687200063 2022-03-17 04:53:11,901.901 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02277860790491104 2022-03-17 04:53:11,902.902 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:53:11,902.902 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'small', 'teddy', 'bear', '[MASK]', 'in', 'the', 'fore', '##ground', '[MASK]', 'people', 'walk', 'down', 'a', 'sidewalk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:53:11,917.917 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'hair', 'shirt', 'window', '[UNK]', 'woman', 'person', 'sidewalk', 'fence', 'jacket', 'eye', 'tree', 'man', 'hand', 'head', 'arm', 'mouth', 'nose', 'bag', 'railing', 'sky', 'ear', 'purse', 'face', 'backpack', 'street', 'wall', 'sign', 'strap', 'sleeve', 'food', 'lady', 'sweater', 'tongue', 'paper', 'finger', 'neck', 'pole', 'glasses', 'girl', 'road', 'light', 'car', 'bear', 'shoulder', 'hat', 'dog', 'roof', 'animal', 'boy'] 2022-03-17 04:53:27,798.798 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'face', 'small', 'line', 'building', 'road', 'street', 'woman', 'hair', 'person', 'arm', 'eye', 'window', 'tree', 'sign', 'sky', 'shirt', 'bus', 'animal', 'nose', 'bag', 'bear', 'hat', 'cap', 'pole', 'fence', 'purse', 'teddy', 'sidewalk', 'sweater', 'railing'] 2022-03-17 04:55:50,701.701 2829:trainer.py:487 do_train_dict(): eta: 7:51:58 iter: 50100 speed: 265.0 images/sec total_norm: 148.4082 (149.6757) loss: 139.4005 (141.2265) masked_loss: 1.4070 (1.4382) tag_loss: 137.9250 (139.7883) time: 1.4330 (1.9320) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8396) save_time: 8.8421 (15.3432) lr: 0.000025 max mem: 26307 2022-03-17 04:55:51,062.062 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 04:55:51,063.063 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.89891052246094 2022-03-17 04:55:51,063.063 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.44018154220277 2022-03-17 04:56:15,986.986 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02279139868915081 2022-03-17 04:56:15,987.987 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:56:15,987.987 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', '[MASK]', 'staring', '[MASK]', 'the', 'camera', 'sticking', 'his', 'nose', 'in', 'between', 'the', 'handles', 'of', 'a', '##don', 'of', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:56:16,003.003 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['scissors', 'eye', 'face', 'handle', 'ear', 'eyebrow', '[UNK]', 'hand', 'screw', 'person', 'nose', 'lip', 'finger', 'hair', 'boy', 'nail', 'cheek', 'mouth', 'head', 'arm', 'bolt', 'glasses', 'man', 'blade', 'woman', 'pair', 'wall', 'forehead', 'shadow', 'girl', 'piece', 'thumb', 'red', 'string', 'background', 'shoulder', 'knot', 'shirt', 'hole', 'button', 'hat', 'design', 'young', 'reflection', 'close', 'brush', 'neck', 'metal', 'star', 'child'] 2022-03-17 04:56:31,885.885 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'old', 'face', 'young', 'hair', 'mouth', 'arm', 'boy', 'eye', 'metal', 'piece', 'pair', 'finger', 'nose', 'ear', 'camera', 'handle', 'cheek', 'shadow', 'lip', 'forehead', 'eyebrow', 'screw', 'scissors'] 2022-03-17 04:58:56,230.230 2829:trainer.py:487 do_train_dict(): eta: 7:49:08 iter: 50200 speed: 276.0 images/sec total_norm: 147.6705 (150.4382) loss: 141.8247 (142.1822) masked_loss: 1.4507 (1.4878) tag_loss: 140.2529 (140.6944) time: 1.4336 (1.8553) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4284 (1.8501) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 04:58:56,592.592 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7222222089767456 2022-03-17 04:58:56,593.593 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.3173828125 2022-03-17 04:58:56,593.593 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.45479534444942 2022-03-17 04:59:22,049.049 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022828172892332077 2022-03-17 04:59:22,050.050 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 04:59:22,050.050 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'dressed', 'in', 'business', 'casual', '[MASK]', 'is', 'alone', 'in', 'a', 'inflated', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 04:59:22,065.065 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tie', 'face', 'wall', 'man', 'shirt', 'glasses', 'nose', 'ceiling', 'hair', 'mouth', 'head', 'ear', 'eye', '[UNK]', 'arm', 'room', 'hand', 'light', 'window', 'floor', 'sleeve', 'table', 'knot', 'collar', 'jacket', 'belt', 'chair', 'neck', 'door', 'blind', 'picture', 'finger', 'suit', 'pillow', 'black', 'couch', 'switch', 'cabinet', 'young', 'lamp', 'camera', 'white', 'box', 'person', 'mirror', 'wrist', 'bag', 'bottle', 'board', 'drawer'] 2022-03-17 04:59:38,052.052 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'face', 'room', 'light', 'business', 'board', 'hair', 'mouth', 'floor', 'wall', 'arm', 'eye', 'neck', 'window', 'box', 'jean', 'shirt', 'nose', 'tie', 'ceiling', 'glasses', 'casual', 'sleeve', 'cord', 'attire'] 2022-03-17 05:02:01,552.552 2829:trainer.py:487 do_train_dict(): eta: 7:46:18 iter: 50300 speed: 276.3 images/sec total_norm: 146.8121 (147.7029) loss: 136.5942 (139.1961) masked_loss: 1.4032 (1.4419) tag_loss: 135.4883 (137.7542) time: 1.4329 (1.8532) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4276 (1.8480) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:02:01,913.913 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125 2022-03-17 05:02:01,913.913 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.7757568359375 2022-03-17 05:02:01,913.913 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.45914998887078 2022-03-17 05:02:27,186.186 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022834159433841705 2022-03-17 05:02:27,187.187 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:02:27,187.187 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'young', 'boys', 'are', 'preparing', 'for', 'a', '[MASK]', 'at', 'a', '[MASK]', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:02:27,202.202 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jersey', 'fence', 'helmet', 'player', 'glove', 'shoe', 'person', '[UNK]', 'number', 'shirt', 'ground', 'leg', 'man', 'arm', 'hand', 'belt', 'field', 'uniform', 'baseball', 'head', 'pole', 'dirt', 'sock', 'back', 'bat', 'batter', 'photo', 'spectator', 'line', 'hat', 'game', 'boy', 'plate', 'ball', 'home', 'cap', 'knee', 'background', 'umpire', 'logo', 'name', 'catcher', 'foot', 'grass', 'white', 'pad', 'sign', 'base', 'young', 'watch'] 2022-03-17 05:02:43,198.198 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'man', 'hand', 'number', 'game', 'play', 'young', 'player', 'field', 'ground', 'person', 'boy', 'baseball', 'ball', 'shirt', 'jersey', 'leg', 'belt', 'hat', 'cap', 'uniform', 'pole', 'dirt', 'fence', 'helmet', 'shoe', 'glove', 'batter', 'sock'] 2022-03-17 05:05:06,648.648 2829:trainer.py:487 do_train_dict(): eta: 7:43:28 iter: 50400 speed: 276.6 images/sec total_norm: 149.3395 (150.5801) loss: 135.7944 (138.4608) masked_loss: 1.4548 (1.4752) tag_loss: 134.3376 (136.9856) time: 1.4320 (1.8510) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.8458) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:05:07,011.011 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 05:05:07,011.011 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.78053283691406 2022-03-17 05:05:07,011.011 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.4602456347777 2022-03-17 05:05:32,250.250 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022814041003584862 2022-03-17 05:05:32,250.250 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:05:32,250.250 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'surf', '##boards', 'are', '[MASK]', 'up', 'on', 'the', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:05:32,266.266 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'beach', 'person', 'cloud', 'sand', 'woman', 'building', 'man', 'bikini', 'child', 'girl', '[UNK]', 'boat', 'shirt', 'stripe', 'short', 'boy', 'water', 'hair', 'suit', 'bathing', 'shadow', 'bottom', 'chair', 'ocean', 'board', 'fin', 'top', 'towel', 'hand', 'hat', 'wave', 'blue', 'umbrella', 'surf', 'dress', 'house', 'flag', 'group', 'writing', 'head', 'window', 'logo', 'reflection', 'foot', 'family', 'sandy', 'name', 'couple', 'dog'] 2022-03-17 05:05:48,238.238 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'family', 'man', 'group', 'water', 'building', 'top', 'woman', 'short', 'board', 'hair', 'girl', 'person', 'child', 'boy', 'writing', 'beach', 'sign', 'sky', 'shirt', 'dress', 'suit', 'tank', 'flag', 'hat', 'cloud', 'cap', 'pole', 'logo', 'tent', 'banner', 'towel', 'fin', 'bathing', 'bikini'] 2022-03-17 05:08:12,113.113 2829:trainer.py:487 do_train_dict(): eta: 7:40:38 iter: 50500 speed: 276.1 images/sec total_norm: 149.1124 (150.4144) loss: 137.7408 (138.6420) masked_loss: 1.4749 (1.4566) tag_loss: 136.2069 (137.1855) time: 1.4326 (1.8547) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4274 (1.8494) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:08:12,474.474 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 05:08:12,474.474 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 98.47399139404297 2022-03-17 05:08:12,474.474 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.4736145155232 2022-03-17 05:08:37,994.994 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02281448245048523 2022-03-17 05:08:37,994.994 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:08:37,995.995 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'boys', 'are', 'running', 'after', '[MASK]', 'soccer', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:08:38,010.010 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'hand', 'short', 'shirt', 'ball', 'boy', 'sock', 'field', 'soccer', 'hair', 'shoe', 'logo', 'head', 'leg', 'jersey', 'shadow', 'arm', 'flower', 'young', 'uniform', 'face', 'man', 'background', 'stripe', '[UNK]', 'fence', 'ground', 'mouth', 'person', 'sleeve', 'knee', 'nose', 'child', 'pole', 'eye', 'bush', 'ear', 'tree', 'player', 'vest', 'girl', 'foot', 'game', 'collar', 'jacket', 'green', 'car', 'number', 'star', 'design'] 2022-03-17 05:08:53,892.892 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'short', 'field', 'hair', 'boy', 'eye', 'ball', 'shirt', 'jersey', 'nose', 'soccer', 'shadow', 'grass', 'flower', 'logo', 'boot', 'shoe', 'sunglasses', 'sock'] 2022-03-17 05:11:17,689.689 2829:trainer.py:487 do_train_dict(): eta: 7:37:48 iter: 50600 speed: 275.9 images/sec total_norm: 150.5359 (151.9252) loss: 140.0559 (140.1738) masked_loss: 1.4527 (1.4399) tag_loss: 138.5333 (138.7340) time: 1.4320 (1.8557) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4270 (1.8507) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:11:18,050.050 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184 2022-03-17 05:11:18,050.050 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.77249145507812 2022-03-17 05:11:18,050.050 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47939107168826 2022-03-17 05:11:43,495.495 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022816665470600128 2022-03-17 05:11:43,495.495 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:11:43,496.496 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'up', 'of', 'a', 'clock', '[MASK]', '[MASK]', 'a', 'steep', '##le', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:11:43,511.511 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['clock', 'hand', 'tower', 'sky', 'building', 'window', 'cross', 'cloud', '[UNK]', 'roof', 'face', 'top', 'wall', 'large', 'blue', 'tall', 'big', 'weather', 'number', 'design', 'vane', 'hour', 'arch', 'spire', 'day', 'cloudy', 'ornate', 'white', 'side', 'gold', 'high', 'base', 'star', 'red', 'tree', 'column', 'city', 'middle', 'picture', 'finger', 'pole', 'bottom', 'wire', 'roman', 'beautiful', 'green', 'brick', 'front', 'view', 'pillar'] 2022-03-17 05:11:59,379.379 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'building', 'close', 'cross', 'window', 'tower', 'sky', 'roof', 'clock'] 03-17 05:13:16.976 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 05:13:16.976 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 05:13:18.044 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}] 2022-03-17 05:14:23,125.125 2829:trainer.py:487 do_train_dict(): eta: 7:34:58 iter: 50700 speed: 276.1 images/sec total_norm: 147.4799 (149.7552) loss: 141.0554 (140.8234) masked_loss: 1.4834 (1.5086) tag_loss: 139.4084 (139.3148) time: 1.4326 (1.8543) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4274 (1.8492) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:14:23,487.487 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 05:14:23,487.487 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.40927124023438 2022-03-17 05:14:23,487.487 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
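The aml_server.py monitor() records interleaved with the training log (here at 05:13, and again at 05:43 and 06:13) print a Python-literal list of per-GPU dicts, so ast.literal_eval can recover them directly. A sketch, with the payload abridged to two of the eight GPUs from the record above; the 90% alert threshold is an arbitrary illustration:

import ast

payload = ("[{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, "
           "{'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 94}]")

gpus = ast.literal_eval(payload)
util = [g["gpu_util"] for g in gpus]
mem_frac = [g["mem_used"] / g["mem_total"] for g in gpus]

print(f"gpu_util min/mean: {min(util)}/{sum(util) / len(util):.1f}")  # 94/97.0
print(f"peak memory use: {max(mem_frac):.1%}")                        # 89.3%
if min(util) < 90:
    print("possible straggler or input stall on at least one GPU")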
= 71.48357393610196 2022-03-17 05:14:49,193.193 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02281193435192108 2022-03-17 05:14:49,193.193 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:14:49,193.193 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'blue', 'beach', 'vernacular', 'under', 'a', '[MASK]', 'umbrella', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:14:49,209.209 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'water', 'beach', 'ocean', 'sand', 'umbrella', 'wave', 'pole', 'chair', 'cloud', 'boat', 'shore', 'yellow', 'leg', 'shadow', '[UNK]', 'footprint', 'lawn', 'back', 'top', 'sandy', 'blue', 'towel', 'rock', 'mountain', 'day', 'person', 'cushion', 'lounge', 'empty', 'table', 'body', 'colorful', 'sun', 'area', 'next', 'sunny', 'arm', 'scene', 'bucket', 'handle', 'board', 'front', 'grass', 'couple', 'post', 'seat', 'large', 'background', 'red'] 2022-03-17 05:15:05,034.034 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'water', 'top', 'rock', 'blue', 'chair', 'beach', 'sky', 'yellow', 'ocean', 'leg', 'wave', 'shore', 'sand', 'cloud', 'pole', 'umbrella'] 2022-03-17 05:17:28,742.742 2829:trainer.py:487 do_train_dict(): eta: 7:32:08 iter: 50800 speed: 275.8 images/sec total_norm: 147.6092 (152.0821) loss: 141.1256 (140.7496) masked_loss: 1.4150 (1.3789) tag_loss: 139.6001 (139.3707) time: 1.4323 (1.8562) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.8510) save_time: 8.8421 (15.3432) lr: 0.000024 max mem: 26307 2022-03-17 05:17:29,104.104 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7878788113594055 2022-03-17 05:17:29,104.104 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.2214813232422 2022-03-17 05:17:29,104.104 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47611718468207 2022-03-17 05:17:54,736.736 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022800426930189133 2022-03-17 05:17:54,736.736 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:17:54,736.736 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'people', 'posing', 'for', '[MASK]', 'photograph', 'at', '[MASK]', 'black', 'tie', 'event', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:17:54,752.752 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['woman', 'hat', 'hand', 'man', 'hair', 'shirt', 'tree', 'person', 'necklace', 'head', '[UNK]', 'tie', 'girl', 'dress', 'sunglasses', 'short', 'boy', 'face', 'ground', 'group', 'suit', 'shoe', 'glasses', 'camera', 'baby', 'bracelet', 'bag', 'phone', 'bottle', 'purse', 'flower', 'child', 'helmet', 'watch', 'leg', 'sky', 'young', 'jacket', 'boot', 'top', 'sock', 'nose', 'other', 'fence', 'glass', 'eye', 'skirt', 'ring', 'cap', 'bench'] 2022-03-17 05:18:10,722.722 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'face', 'black', 'woman', 'cup', 'short', 'hair', 'girl', 'person', 'event', 'boy', 'glass', 'baby', 'tree', 'watch', 'shirt', 'picture', 'dress', 'suit', 'tie', 'bottle', 'hat', 'photograph', 'shoe', 'necklace', 'sunglasses', 'groom'] 2022-03-17 05:20:34,279.279 2829:trainer.py:487 do_train_dict(): eta: 7:29:18 iter: 50900 speed: 276.0 images/sec total_norm: 149.1933 (153.1358) loss: 139.8029 (142.4593) masked_loss: 1.4123 (1.4768) tag_loss: 137.6314 (140.9825) time: 1.4326 (1.8554) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8502) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:20:34,641.641 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 05:20:34,641.641 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.60215759277344 2022-03-17 05:20:34,642.642 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47513413522758 2022-03-17 05:21:00,160.160 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022831393405795097 2022-03-17 05:21:00,161.161 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:21:00,161.161 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'bicycle', 'riders', 'are', '[MASK]', 'a', 'street', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:21:00,177.177 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'bicycle', 'road', 'bike', 'street', 'man', 'shirt', 'person', 'pole', 'tree', 'sidewalk', 'wheel', 'sky', '[UNK]', 'building', 'sign', 'tire', 'window', 'short', 'helmet', 'light', 'jacket', 'woman', 'shoe', 'car', 'curb', 'hand', 'backpack', 'head', 'hair', 'traffic', 'house', 'jean', 'fence', 'city', 'bag', 'wall', 'stop', 'boy', 'roof', 'grass', 'bush', 'truck', 'fire', 'arrow', 'cloud', 'bus', 'leg', 'arm', 'vest'] 2022-03-17 05:21:16,092.092 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'line', 'building', 'road', 'street', 'light', 'car', 'person', 'tree', 'sky', 'shirt', 'traffic', 'roof', 'wheel', 'grass', 'bush', 'pole', 'jacket', 'bike', 'fence', 'bicycle', 'helmet', 'sidewalk', 'tire', 'backpack', 'curb', 'chimney', 'vest', 'hedge', 'biker'] 2022-03-17 05:23:39,930.930 2829:trainer.py:487 do_train_dict(): eta: 7:26:28 iter: 51000 speed: 275.8 images/sec total_norm: 148.4104 (150.0822) loss: 138.0547 (139.5933) masked_loss: 1.4620 (1.5098) tag_loss: 136.9248 (138.0835) time: 1.4325 (1.8565) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4273 (1.8511) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:23:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-17 05:23:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.33363342285156 2022-03-17 05:23:40,291.291 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47277390840236 2022-03-17 05:24:05,973.973 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022862110286951065 2022-03-17 05:24:05,974.974 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:24:05,974.974 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'zoo', 'enclosure', 'containing', '[MASK]', '##raf', '##fe', '##s', 'and', 'os', '##tric', '##hes', '[MASK]', 'the', '[MASK]', 'exhibit', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:24:05,989.989 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'neck', 'fence', '[UNK]', 'rock', 'tail', 'shadow', 'leg', 'head', 'sky', 'bird', 'animal', 'field', 'ground', 'trunk', 'building', 'cow', 'sheep', 'zebra', 'zoo', 'feather', 'beak', 'post', 'grassy', 'horn', 'branch', 'hill', 'ear', 'bush', 'large', 'group', 'wing', 'area', 'spot', 'baby', 'goose', 'park', 'water', 'foot', 'enclosure', 'front', 'top', 'next', 'pole', 'boulder', 'stone', 'face', 'roof', 'standing'] 2022-03-17 05:24:21,929.929 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'same', 'ground', 'rock', 'neck', 'tree', 'leg', 'shadow', 'grass', 'tail', 'trunk', 'exhibit', 'fence', 'zoo', 'enclosure'] 2022-03-17 05:26:45,850.850 2829:trainer.py:487 do_train_dict(): eta: 7:23:38 iter: 51100 speed: 275.4 images/sec total_norm: 147.6986 (148.4872) loss: 141.5769 (142.8459) masked_loss: 1.4637 (1.5038) tag_loss: 140.3699 (141.3421) time: 1.4332 (1.8592) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8540) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:26:46,209.209 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 05:26:46,210.210 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.4515380859375 2022-03-17 05:26:46,210.210 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
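The per-record tag metrics (Tag Precision hovering around 71.5 here, Tag mAP ≈ 0.023) are computed by the pipeline over the whole batch with its own ranking and thresholds, which these logs do not show. As a reading aid only, the set-overlap idea behind precision and recall can be illustrated on the head of a Sample Generation list against the GT Tags of the zoo record above:

# Illustrative only: top-5 of the Sample Generation list vs. the GT
# Tags of the same record. Not the pipeline's actual metric code.
pred = ['tree', 'grass', 'neck', 'fence', '[UNK]']
gt = {'[UNK]', 'head', 'same', 'ground', 'rock', 'neck', 'tree', 'leg',
      'shadow', 'grass', 'tail', 'trunk', 'exhibit', 'fence', 'zoo',
      'enclosure'}

hits = [t for t in pred if t in gt]
print(f"precision@5 = {len(hits) / len(pred):.2f}")  # 1.00
print(f"recall@5    = {len(hits) / len(gt):.2f}")    # 0.31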
= 71.46929217129946 2022-03-17 05:27:12,076.076 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02284674160182476 2022-03-17 05:27:12,076.076 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:27:12,077.077 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'portuguese', 'her', 'horse', 'and', 'aleksandr', 'in', 'a', 'fence', '##d', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:27:12,092.092 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'fence', 'dog', 'ear', 'leg', 'head', 'hair', 'girl', 'tail', 'field', 'horse', 'shirt', 'mane', '[UNK]', 'flower', 'post', 'woman', 'leash', 'hand', 'child', 'chain', 'face', 'arm', 'dress', 'neck', 'rope', 'person', 'black', 'mouth', 'brown', 'glasses', 'tongue', 'watch', 'back', 'collar', 'nose', 'tree', 'couple', 'short', 'eye', 'harness', 'foot', 'next', 'lady', 'wire', 'spot', 'pony', 'top', 'body', 'fur'] 2022-03-17 05:27:28,041.041 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'back', 'head', 'hand', 'woman', 'field', 'hair', 'girl', 'post', 'arm', 'horse', 'shirt', 'dog', 'leg', 'bag', 'ear', 'grass', 'tail', 'rope', 'fence', 'boot'] 2022-03-17 05:29:51,538.538 2829:trainer.py:487 do_train_dict(): eta: 7:20:48 iter: 51200 speed: 275.7 images/sec total_norm: 149.6668 (151.3736) loss: 140.0519 (139.6391) masked_loss: 1.3956 (1.4176) tag_loss: 138.5114 (138.2215) time: 1.4318 (1.8569) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8518) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:29:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4571428596973419 2022-03-17 05:29:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.3666534423828 2022-03-17 05:29:51,899.899 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47377428348534 2022-03-17 05:30:17,620.620 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022847270593047142 2022-03-17 05:30:17,621.621 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:30:17,621.621 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'is', 'walking', 'in', 'a', '[MASK]', 'with', 'an', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:30:17,636.636 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'umbrella', 'tree', 'person', 'street', 'ground', 'building', 'road', '[UNK]', 'leg', 'pole', 'man', 'truck', 'cloud', 'coat', 'car', 'sidewalk', 'fence', 'wheel', 'bag', 'foot', 'shoe', 'light', 'photo', 'bus', 'sign', 'jacket', 'tire', 'wire', 'shadow', 'woman', 'line', 'rain', 'van', 'boot', 'background', 'black', 'head', 'couple', 'roof', 'purse', 'rainy', 'white', 'picture', 'crane', 'wall', 'hand', 'lamp', 'post', 'brick'] 2022-03-17 05:30:33,639.639 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'road', 'street', 'light', 'ground', 'person', 'arm', 'base', 'van', 'foot', 'tree', 'sign', 'sky', 'bus', 'leg', 'bag', 'truck', 'billboard', 'wheel', 'coat', 'monument', 'cloud', 'statue', 'photo', 'pole', 'trailer', 'tent', 'courtyard', 'umbrella'] 2022-03-17 05:32:57,299.299 2829:trainer.py:487 do_train_dict(): eta: 7:17:57 iter: 51300 speed: 275.6 images/sec total_norm: 147.9342 (150.8931) loss: 142.3930 (144.0219) masked_loss: 1.4029 (1.4181) tag_loss: 140.9289 (142.6038) time: 1.4327 (1.8576) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4277 (1.8525) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:32:57,660.660 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4848484992980957 2022-03-17 05:32:57,660.660 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.57009887695312 2022-03-17 05:32:57,660.660 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47661365701994 2022-03-17 05:33:23,353.353 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0228738933801651 2022-03-17 05:33:23,355.355 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:33:23,355.355 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'tall', 'building', 'with', 'a', 'clock', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:33:23,371.371 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'tower', 'building', 'moon', 'window', 'clock', 'roof', 'wall', 'cross', 'top', '[UNK]', 'tree', 'light', 'spire', 'blue', 'pole', 'hand', 'cloud', 'tall', 'arch', 'background', 'church', 'bird', 'view', 'large', 'clear', 'street', 'statue', 'structure', 'flag', 'brick', 'city', 'vane', 'night', 'snow', 'archway', 'chimney', 'fence', 'door', 'house', 'doorway', 'weather', 'front', 'distance', 'branch', 'red', 'railing', 'person', 'day', 'leaf'] 2022-03-17 05:33:39,314.314 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'wall', 'cross', 'window', 'tower', 'sky', 'tall', 'moon', 'roof', 'clock', 'pillar'] 2022-03-17 05:36:03,064.064 2829:trainer.py:487 do_train_dict(): eta: 7:15:07 iter: 51400 speed: 275.6 images/sec total_norm: 147.7725 (150.5090) loss: 141.1639 (140.6418) masked_loss: 1.4593 (1.4557) tag_loss: 139.9064 (139.1860) time: 1.4317 (1.8577) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8525) save_time: 8.8421 (15.3432) lr: 0.000023 max mem: 26307 2022-03-17 05:36:03,425.425 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-17 05:36:03,426.426 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.674072265625 2022-03-17 05:36:03,426.426 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.47598486594784 2022-03-17 05:36:29,253.253 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0229461919516325 2022-03-17 05:36:29,253.253 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:36:29,254.254 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'riding', 'ski', '##s', 'across', 'a', 'snow', 'covered', 'slope', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:36:29,269.269 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'tree', 'snow', 'jacket', 'ski', 'man', 'pole', 'ground', 'fence', 'head', 'boot', 'glove', 'face', 'coat', 'glasses', 'sky', 'hand', 'leg', 'foot', 'person', 'building', 'trunk', 'skier', 'poles', 'hair', 'shoe', 'hat', 'background', 'snowy', 'house', 'track', 'hill', 'slope', 'helmet', 'arm', 'top', 'strap', 'backpack', 'hood', 'cloud', 'woman', 'mountain', 'stick', 'post', 'roof', 'black', 'next', 'bench', 'sign', 'guy'] 2022-03-17 05:36:45,168.168 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'house', 'hand', 'face', 'building', 'ground', 'hair', 'tree', 'sign', 'snow', 'coat', 'pole', 'jacket', 'hood', 'glasses', 'ski', 'fence', 'boot', 'slope', 'glove'] 2022-03-17 05:39:08,972.972 2829:trainer.py:487 do_train_dict(): eta: 7:12:17 iter: 51500 speed: 275.4 images/sec total_norm: 148.2207 (150.6410) loss: 141.6054 (140.1483) masked_loss: 1.4545 (1.4760) tag_loss: 139.9645 (138.6723) time: 1.4320 (1.8591) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8539) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:39:09,336.336 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 05:39:09,336.336 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.5184555053711 2022-03-17 05:39:09,336.336 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.48144597046135 2022-03-17 05:39:35,358.358 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022993769496679306 2022-03-17 05:39:35,358.358 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:39:35,359.359 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', '[MASK]', 'two', 'bikes', 'crosses', 'the', 'road', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:39:35,374.374 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'person', 'bicycle', 'bike', 'shirt', '[UNK]', 'building', 'street', 'cone', 'hair', 'vest', 'chain', 'hat', 'road', 'line', 'shoe', 'stripe', 'tire', 'jacket', 'hand', 'sign', 'window', 'arm', 'head', 'tree', 'pole', 'woman', 'background', 'wheel', 'traffic', 'collar', 'cap', 'logo', 'jean', 'motorcycle', 'barrier', 'sky', 'backpack', 'camera', 'light', 'letter', 'mirror', 'city', 'basket', 'wall', 'back', 'barrel', 'bag', 'crowd', 'group'] 2022-03-17 05:39:51,336.336 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'line', 'building', 'road', 'street', 'hair', 'person', 'wall', 'window', 'sign', 'shirt', 'chain', 'wheel', 'hat', 'cap', 'bike', 'logo', 'fence', 'collar', 'bicycle', 'shoe', 'tire', 'cone', 'pedal', 'vest', 'stripe'] 2022-03-17 05:42:15,114.114 2829:trainer.py:487 do_train_dict(): eta: 7:09:26 iter: 51600 speed: 275.1 images/sec total_norm: 148.3121 (151.9691) loss: 141.3788 (142.1773) masked_loss: 1.4018 (1.4124) tag_loss: 139.9413 (140.7649) time: 1.4323 (1.8614) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8563) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:42:15,475.475 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7878788113594055 2022-03-17 05:42:15,475.475 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.30194091796875 2022-03-17 05:42:15,475.475 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.48433929531902 2022-03-17 05:42:41,580.580 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022996308282017708 2022-03-17 05:42:41,581.581 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:42:41,581.581 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'and', 'old', 'person', 'are', 'playing', '[MASK]', 'games', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:42:41,596.596 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glasses', 'hand', 'hair', 'woman', 'shirt', 'wall', 'face', 'window', 'arm', 'sweater', '[UNK]', 'controller', 'jean', 'remote', 'room', 'head', 'lady', 'game', 'couch', 'floor', 'blind', 'mouth', 'table', 'nose', 'ceiling', 'chair', 'leg', 'pillow', 'girl', 'watch', 'door', 'shelf', 'ear', 'curtain', 'sofa', 'box', 'wii', 'light', 'book', 'wrist', 'bottle', 'picture', 'handle', 'bag', 'video', 'cabinet', 'plant', 'shoe', 'person', 'belt'] 2022-03-17 05:42:57,584.584 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'old', 'face', 'room', 'young', 'woman', 'cup', 'living', 'hair', 'girl', 'video', 'person', 'table', 'wall', 'arm', 'boy', 'chair', 'window', 'watch', 'shirt', 'kid', 'ottoman', 'blind', 'couch', 'glasses', 'skirt', 'pillow', 'sofa', 'sweater', 'cushion'] 03-17 05:43:18.123 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 05:43:18.123 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 05:43:19.494 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 05:45:20,986.986 2829:trainer.py:487 do_train_dict(): eta: 7:06:36 iter: 51700 speed: 275.5 images/sec total_norm: 147.2572 (150.6359) loss: 139.5363 (139.7787) masked_loss: 1.4213 (1.4252) tag_loss: 138.2551 (138.3536) time: 1.4312 (1.8587) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.8535) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:45:21,347.347 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5142857432365417 2022-03-17 05:45:21,348.348 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.24557495117188 2022-03-17 05:45:21,348.348 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.48552903429422 2022-03-17 05:45:47,344.344 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022976456210017204 2022-03-17 05:45:47,345.345 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:45:47,345.345 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'number', 'of', 'colorful', 'kite', '[MASK]', 'flying', 'under', 'a', 'ramps', 'sky', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:45:47,360.360 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['kite', 'sky', 'man', 'shirt', 'person', 'hair', 'string', 'flag', 'head', 'woman', '[UNK]', 'grass', 'field', 'tent', 'sunglasses', 'jacket', 'crowd', 'ground', 'building', 'tail', 'hat', 'face', 'pole', 'bicycle', 'air', 'ear', 'number', 'park', 'jean', 'glasses', 'tree', 'eye', 'hand', 'child', 'group', 'fence', 'balloon', 'bike', 'large', 'coat', 'cloud', 'beach', 'shadow', 'bag', 'wall', 'street', 'arm', 'sign', 'hill', 'boy'] 2022-03-17 05:46:03,346.346 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'number', 'hair', 'blue', 'person', 'tree', 'sky', 'shirt', 'crowd', 'string', 'jacket', 'bike', 'colorful', 'kite'] 2022-03-17 05:48:27,042.042 2829:trainer.py:487 do_train_dict(): eta: 7:03:45 iter: 51800 speed: 275.2 images/sec total_norm: 147.3880 (150.1413) loss: 139.3582 (140.6168) masked_loss: 1.4529 (1.4726) tag_loss: 138.3924 (139.1443) time: 1.4310 (1.8605) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4257 (1.8554) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:48:27,403.403 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 05:48:27,403.403 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 110.99067687988281 2022-03-17 05:48:27,403.403 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49655274113708 2022-03-17 05:48:53,571.571 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0229787640273571 2022-03-17 05:48:53,572.572 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:48:53,572.572 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bush', 'filled', 'with', '[MASK]', 'of', 'purple', 'flowers', 'near', 'water', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:48:53,587.587 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['flower', 'sky', 'water', 'plant', 'bench', 'tree', 'hill', 'grass', 'cloud', 'field', 'bush', 'building', 'person', '[UNK]', 'city', 'boat', 'background', 'bridge', 'pole', 'garden', 'lake', 'park', 'shirt', 'sidewalk', 'front', 'ground', 'large', 'tower', 'river', 'blue', 'next', 'top', 'trunk', 'man', 'bicycle', 'dirt', 'head', 'woman', 'post', 'fence', 'green', 'house', 'branch', 'light', 'lamp', 'leaf', 'boy', 'pot', 'umbrella', 'road'] 2022-03-17 05:49:09,480.480 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'building', 'field', 'hill', 'plant', 'train', 'tree', 'sky', 'shirt', 'wheel', 'grass', 'bush', 'cloud', 'purple', 'flower', 'bench', 'bike', 'bicycle'] 2022-03-17 05:51:33,098.098 2829:trainer.py:487 do_train_dict(): eta: 7:00:55 iter: 51900 speed: 275.2 images/sec total_norm: 148.6713 (154.2171) loss: 140.7731 (140.5863) masked_loss: 1.4384 (1.4408) tag_loss: 139.3752 (139.1456) time: 1.4319 (1.8606) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.8555) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:51:33,460.460 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7941176295280457 2022-03-17 05:51:33,460.460 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.7630615234375 2022-03-17 05:51:33,460.460 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
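The caption acc values (0.46 to 0.79 across these records) are consistent with a fraction of masked caption positions predicted correctly, but the log does not spell the computation out, so treat the following as a guess at its shape rather than the pipeline's code. A PyTorch-style sketch with illustrative tensor names and shapes:

import torch

def masked_accuracy(logits, labels, ignore_index=-100):
    # Accuracy over masked positions only; unmasked/[PAD] positions
    # carry ignore_index and are excluded from the denominator.
    preds = logits.argmax(dim=-1)
    valid = labels != ignore_index
    correct = (preds == labels) & valid
    return correct.sum().float() / valid.sum().clamp(min=1)

logits = torch.randn(2, 5, 100)          # (batch, seq, vocab), illustrative
labels = torch.full((2, 5), -100)        # nothing masked yet
labels[0, 2] = 7                         # one masked position, label id 7
print(masked_accuracy(logits, labels))   # 0.0 or 1.0 for this single position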
= 71.49959303782536 2022-03-17 05:51:59,600.600 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022981395944952965 2022-03-17 05:51:59,600.600 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:51:59,601.601 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'dirty', 'white', 'toilet', 'filled', 'with', 'beverage', 'containers', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:51:59,616.616 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['toilet', 'floor', 'water', 'bowl', 'wall', 'lid', 'seat', 'bathroom', 'tile', 'head', '[UNK]', 'arm', 'hair', 'handle', 'shirt', 'hand', 'tail', 'small', 'line', 'cap', 'baby', 'tank', 'ear', 'toy', 'short', 'child', 'paper', 'beak', 'man', 'trash', 'can', 'person', 'metal', 'shoe', 'white', 'foot', 'boy', 'ground', 'animal', 'pipe', 'body', 'leg', 'bear', 'brush', 'bird', 'top', 'glove', 'towel', 'cup', 'green'] 2022-03-17 05:52:15,566.566 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'body', 'white', 'floor', 'wall', 'seat', 'bowl', 'bottle', 'cap', 'dirty', 'pole', 'shoe', 'toilet', 'lid', 'beverage'] 2022-03-17 05:54:39,196.196 2829:trainer.py:487 do_train_dict(): eta: 6:58:04 iter: 52000 speed: 275.1 images/sec total_norm: 148.8579 (152.1922) loss: 140.9233 (140.0620) masked_loss: 1.4459 (1.4548) tag_loss: 139.4069 (138.6072) time: 1.4319 (1.8610) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8558) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:54:39,557.557 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 05:54:39,557.557 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.58297729492188 2022-03-17 05:54:39,557.557 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49819077182411 2022-03-17 05:55:05,623.623 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02299141138792038 2022-03-17 05:55:05,623.623 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:55:05,624.624 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'objects', 'are', 'sitting', 'on', 'top', 'of', 'a', 'glass', 'table', '.', '##yx', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:55:05,639.639 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'hand', 'can', 'man', 'couch', 'floor', 'shirt', 'shoe', 'hair', 'leg', 'person', 'rug', 'table', 'head', 'carpet', 'glasses', 'wall', 'face', 'bag', 'arm', 'chair', 'bottle', 'jean', 'phone', 'trash', 'ear', 'jacket', 'nose', 'cap', 'watch', 'glass', 'remote', 'blanket', 'mat', 'room', 'box', 'cup', 'screen', 'wheel', 'sofa', 'book', 'boy', 'beer', 'finger', 'door', 'cell', 'controller', 'shelf', 'soda', 'mouth'] 2022-03-17 05:55:21,634.634 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'man', 'hand', 'top', 'control', 'person', 'floor', 'table', 'wall', 'phone', 'glass', 'box', 'cell', 'cd', 'jean', 'shirt', 'label', 'bag', 'bowl', 'beer', 'bottle', 'cap', 'couch', 'remote', 'shoe', 'candle', 'jar'] 2022-03-17 05:57:45,465.465 2829:trainer.py:487 do_train_dict(): eta: 6:55:13 iter: 52100 speed: 274.9 images/sec total_norm: 149.3275 (152.0727) loss: 139.6746 (140.1645) masked_loss: 1.4134 (1.4525) tag_loss: 138.1430 (138.7120) time: 1.4317 (1.8627) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8572) save_time: 8.8421 (15.3432) lr: 0.000022 max mem: 26307 2022-03-17 05:57:45,826.826 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 05:57:45,826.826 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.1970672607422 2022-03-17 05:57:45,826.826 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49746650754264 2022-03-17 05:58:12,149.149 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.022987600415945053 2022-03-17 05:58:12,150.150 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 05:58:12,150.150 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', '[MASK]', '[MASK]', 'ski', '##s', 'are', 'standing', 'on', 'a', 'slope', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 05:58:12,165.165 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['jacket', '[UNK]', 'sky', 'pole', 'ski', 'glove', 'snow', 'person', 'head', 'coat', 'hand', 'hat', 'woman', 'man', 'skier', 'boot', 'ground', 'zipper', 'slope', 'face', 'helmet', 'hair', 'outfit', 'leg', 'poles', 'arm', 'couple', 'sunglasses', 'cap', 'snowy', 'hill', 'foot', 'scarf', 'top', 'group', 'girl', 'hood', 'shadow', 'shoe', 'shirt', 'other', 'backpack', 'orange', 'tree', 'glasses', 'line', 'side', 'skiing', 'female', 'day'] 2022-03-17 05:58:28,162.162 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'woman', 'ground', 'hair', 'person', 'sky', 'snow', 'wheel', 'coat', 'hat', 'pole', 'jacket', 'ski', 'boot', 'slope', 'helmet', 'glove', 'skier'] 2022-03-17 06:00:51,661.661 2829:trainer.py:487 do_train_dict(): eta: 6:52:22 iter: 52200 speed: 275.0 images/sec total_norm: 145.5194 (147.7424) loss: 136.4378 (137.5636) masked_loss: 1.4403 (1.4753) tag_loss: 135.0829 (136.0882) time: 1.4325 (1.8619) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.8567) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:00:52,020.020 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 06:00:52,020.020 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 147.0973358154297 2022-03-17 06:00:52,021.021 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.50326824735274 2022-03-17 06:01:18,552.552 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023031720891594887 2022-03-17 06:01:18,553.553 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:01:18,553.553 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'person', 'is', 'doing', 'something', 'that', '[MASK]', 'very', 'interesting', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:01:18,568.568 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'grass', 'ear', 'sheep', 'fence', 'hair', 'sky', 'arm', 'boy', 'wall', 'tree', 'head', 'gravel', 'leg', 'stump', 'door', 'ground', 'shadow', 'animal', 'field', 'log', 'post', 'sleeve', 'face', 'barn', '[UNK]', 'background', 'hand', 'rock', 'building', 'young', 'child', 'tail', 'wool', 'cloud', 'lamb', 'car', 'road', 'person', 'gate', 'small', 'wood', 'dirt', 'camera', 'pole', 'goat', 'baby', 'grazing', 'little', 'dog'] 2022-03-17 06:01:34,630.630 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'something', 'door', 'ground', 'hair', 'post', 'person', 'wall', 'arm', 'boy', 'tree', 'sky', 'shirt', 'animal', 'leg', 'ear', 'shadow', 'grass', 'tail', 'interesting', 'sheep', 'fence', 'log', 'elbow', 'sleeve', 'gravel', 'stump'] 2022-03-17 06:03:57,977.977 2829:trainer.py:487 do_train_dict(): eta: 6:49:32 iter: 52300 speed: 274.8 images/sec total_norm: 148.7513 (150.3584) loss: 138.9591 (140.4302) masked_loss: 1.4191 (1.4284) tag_loss: 137.8493 (139.0018) time: 1.4328 (1.8631) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.8581) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:03:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 06:03:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 183.23245239257812 2022-03-17 06:03:58,342.342 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49200868242569 2022-03-17 06:04:24,737.737 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023032614961266518 2022-03-17 06:04:24,738.738 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:04:24,738.738 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'animals', 'stacked', 'on', 'top', '[MASK]', 'each', 'other', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:04:24,753.753 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'window', 'nose', 'teddy', 'building', 'head', 'ear', 'sign', 'wall', 'eye', 'sidewalk', 'shirt', 'handle', '[UNK]', 'motorcycle', 'ground', 'ribbon', 'store', 'bike', 'door', 'curb', 'face', 'uniform', 'chair', 'man', 'hair', 'animal', 'wheel', 'light', 'leg', 'arm', 'hat', 'floor', 'person', 'bow', 'mouth', 'blue', 'car', 'tag', 'street', 'stuffed', 'jacket', 'bag', 'road', 'tire', 'bat', 'logo', 'hand', 'stripe', 'foot'] 2022-03-17 06:04:40,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'head', 'hand', 'face', 'building', 'top', 'book', 'mouth', 'wall', 'seat', 'arm', 'smile', 'eye', 'chair', 'foot', 'window', 'box', 'sign', 'shirt', 'teeth', 'animal', 'nose', 'ear', 'bear', 'uniform', 'tag', 'bat', 'patch', 'bunch', 'monkey', 'doll', 'teddy', 'stuffed', 'lid'] 2022-03-17 06:07:04,118.118 2829:trainer.py:487 do_train_dict(): eta: 6:46:41 iter: 52400 speed: 275.1 images/sec total_norm: 148.5133 (151.8342) loss: 139.5278 (140.1417) masked_loss: 1.3528 (1.4331) tag_loss: 137.5445 (138.7086) time: 1.4311 (1.8614) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4259 (1.8562) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:07:04,479.479 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4722222089767456 2022-03-17 06:07:04,480.480 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 142.93185424804688 2022-03-17 06:07:04,480.480 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.49530140468052 2022-03-17 06:07:30,901.901 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02307848446071148 2022-03-17 06:07:30,901.901 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:07:30,902.902 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'bear', 'is', 'on', 'the', 'pillow', 'and', 'a', 'jacket', 'is', '[MASK]', '[MASK]', 'bed', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:07:30,917.917 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'window', 'pillow', 'blanket', 'bear', 'blind', 'room', 'teddy', 'ear', 'sheet', 'tree', 'wall', 'bedroom', 'head', '[UNK]', 'lamp', 'clothes', 'animal', 'shirt', 'stuffed', 'curtain', 'nightstand', 'foot', 'book', 'shade', 'arm', 'person', 'light', 'cat', 'nose', 'floor', 'large', 'table', 'leg', 'top', 'jacket', 'clock', 'chair', 'next', 'post', 'picture', 'tail', 'couple', 'paw', 'small', 'paper', 'dresser', 'white', 'cover', 'building'] 2022-03-17 06:07:46,828.828 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['room', 'bed', 'window', 'tree', 'shirt', 'clothes', 'ear', 'bear', 'blind', 'jacket', 'blanket', 'pillow', 'lamp', 'teddy'] 2022-03-17 06:10:10,588.588 2829:trainer.py:487 do_train_dict(): eta: 6:43:50 iter: 52500 speed: 274.6 images/sec total_norm: 150.9259 (152.7547) loss: 138.8904 (139.2738) masked_loss: 1.4213 (1.4351) tag_loss: 137.1731 (137.8387) time: 1.4336 (1.8647) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4286 (1.8596) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:10:10,949.949 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7272727489471436 2022-03-17 06:10:10,949.949 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.15536499023438 2022-03-17 06:10:10,949.949 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.50250674200602 2022-03-17 06:10:37,455.455 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023070955649018288 2022-03-17 06:10:37,455.455 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:10:37,455.455 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'young', 'people', 'are', 'crossing', 'the', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:10:37,471.471 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'light', 'tree', 'street', 'pole', 'car', 'sky', 'building', 'cone', 'sidewalk', 'road', 'person', 'traffic', '[UNK]', 'line', 'city', 'man', 'can', 'window', 'store', 'shirt', 'truck', 'fire', 'woman', 'shadow', 'arrow', 'curb', 'trash', 'intersection', 'van', 'flag', 'mountain', 'box', 'bag', 'bench', 'fence', 'jacket', 'stop', 'jean', 'wall', 'barrier', 'balcony', 'lamp', 'suv', 'roof', 'booth', 'corner', 'letter', 'cover', 'cart'] 2022-03-17 06:10:53,403.403 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'city', 'man', 'group', 'line', 'building', 'road', 'street', 'young', 'light', 'woman', 'car', 'hair', 'person', 'window', 'tree', 'store', 'sign', 'sky', 'jean', 'shirt', 'bus', 'traffic', 'bag', 'hat', 'pole', 'purse', 'balcony', 'cone'] 2022-03-17 06:13:17,095.095 2829:trainer.py:487 do_train_dict(): eta: 6:40:59 iter: 52600 speed: 274.5 images/sec total_norm: 149.9317 (151.5256) loss: 140.9038 (140.4717) masked_loss: 1.4318 (1.4907) tag_loss: 139.1198 (138.9810) time: 1.4329 (1.8651) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4279 (1.8600) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:13:17,455.455 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 06:13:17,455.455 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.94970703125 2022-03-17 06:13:17,456.456 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.50741239109799 03-17 06:13:19.595 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 06:13:19.595 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 06:13:20.315 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-17 06:13:43,694.694 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02310938946902752 2022-03-17 06:13:43,695.695 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:13:43,695.695 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'small', 'tropical', 'umbrella', 'sits', 'on', '[MASK]', 'patch', 'of', 'grass', 'at', 'the', 'beach', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:13:43,710.710 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sand', 'umbrella', 'beach', 'wave', 'shadow', 'ground', 'rug', 'flower', 'towel', 'person', 'water', 'butterfly', '[UNK]', 'grass', 'design', 'pole', 'carpet', 'shore', 'leaf', 'sun', 'ball', 'background', 'handle', 'rock', 'mat', 'green', 'bird', 'sky', 'light', 'colorful', 'post', 'ocean', 'cloud', 'object', 'sandy', 'patch', 'moss', 'snow', 'puddle', 'little', 'logo', 'open', 'wing', 'inside', 'man', 'piece', 'next', 'body', 'top', 'footprint'] 2022-03-17 06:13:59,715.715 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['small', 'ground', 'person', 'beach', 'sky', 'background', 'wave', 'tropical', 'shadow', 'sand', 'grass', 'flower', 'patch', 'towel', 'umbrella', 'rug'] 2022-03-17 06:16:23,595.595 2829:trainer.py:487 do_train_dict(): eta: 6:38:08 iter: 52700 speed: 274.5 images/sec total_norm: 149.2330 (151.5919) loss: 142.3542 (142.5090) masked_loss: 1.4248 (1.4977) tag_loss: 140.8760 (141.0114) time: 1.4324 (1.8651) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.8598) save_time: 8.8421 (15.3432) lr: 0.000021 max mem: 26307 2022-03-17 06:16:23,955.955 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 06:16:23,956.956 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.64234924316406 2022-03-17 06:16:23,956.956 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51453000126463 2022-03-17 06:16:50,185.185 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023102670907974243 2022-03-17 06:16:50,186.186 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:16:50,186.186 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'boy', 'with', 'a', 'uno', '##pen', '##ed', 'tooth', '##brush', 'in', 'his', 'mouth', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:16:50,201.201 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'ear', 'nose', 'boy', 'eye', 'hand', 'hair', 'mouth', 'head', '[UNK]', 'face', 'wall', 'writing', 'door', 'sleeve', 'arm', 'handle', 'frame', 'brush', 'logo', 'design', 'letter', 'tooth', 'finger', 'child', 'table', 'light', 'doorway', 'picture', 'baby', 'curtain', 'young', 'window', 'background', 'box', 'lettering', 'small', 'short', 'shelf', 'button', 'room', 'word', 'little', 'blind', 'blue', 'bottle', 'kid', 'bag', 'pajamas', 'toy'] 2022-03-17 06:17:06,153.153 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'hand', 'little', 'face', 'top', 'hair', 'mouth', 'wall', 'arm', 'boy', 'writing', 'eye', 'letter', 'shirt', 'nose', 'ear', 'logo', 'sleeve', 'container', 'microphone'] 2022-03-17 06:19:31,827.827 2829:trainer.py:487 do_train_dict(): eta: 6:35:18 iter: 52800 speed: 272.0 images/sec total_norm: 149.3551 (151.0412) loss: 141.7490 (141.3010) masked_loss: 1.4350 (1.4581) tag_loss: 140.5589 (139.8429) time: 1.4330 (1.8822) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4278 (1.8770) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:19:32,188.188 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-17 06:19:32,188.188 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.81195068359375 2022-03-17 06:19:32,188.188 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51842562958514 2022-03-17 06:19:58,497.497 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023098180070519447 2022-03-17 06:19:58,497.497 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:19:58,497.497 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tennis', 'player', 'trying', 'to', 'hit', 'the', 'ball', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:19:58,513.513 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'line', '[UNK]', 'court', 'tennis', 'hand', 'man', 'shoe', 'short', 'arm', 'head', 'leg', 'sock', 'hair', 'ball', 'player', 'net', 'logo', 'shadow', 'ground', 'person', 'pole', 'sign', 'letter', 'knee', 'banner', 'handle', 'face', 'air', 'blue', 'cap', 'band', 'chair', 'white', 'foot', 'stand', 'uniform', 'wall', 'hat', 'male', 'stripe', 'match', 'string', 'beard', 'outfit', 'sleeve', 'top', 'fence', 'writing', 'bag'] 2022-03-17 06:20:14,438.438 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'player', 'court', 'short', 'hair', 'post', 'arm', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'shadow', 'net', 'pole', 'shoe', 'sock'] 2022-03-17 06:22:38,402.402 2829:trainer.py:487 do_train_dict(): eta: 6:32:27 iter: 52900 speed: 274.4 images/sec total_norm: 148.2379 (153.0615) loss: 135.4847 (137.7019) masked_loss: 1.4718 (1.4906) tag_loss: 133.8302 (136.2114) time: 1.4328 (1.8658) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4276 (1.8606) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:22:38,763.763 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577 2022-03-17 06:22:38,764.764 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.037353515625 2022-03-17 06:22:38,764.764 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52613963720934 2022-03-17 06:23:05,421.421 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023095954209566116 2022-03-17 06:23:05,421.421 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:23:05,421.421 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'picture', 'of', 'some', '[MASK]', 'sitting', 'down', 'at', '[MASK]', 'table', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:23:05,437.437 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'hair', 'shirt', 'hand', 'chair', 'box', 'window', 'napkin', 'girl', 'cup', 'wall', '[UNK]', 'paper', 'head', 'lid', 'watch', 'woman', 'plate', 'food', 'face', 'glasses', 'bag', 'container', 'arm', 'sandwich', 'necklace', 'coffee', 'tray', 'fork', 'sunglasses', 'boy', 'eye', 'ear', 'nose', 'phone', 'tissue', 'restaurant', 'mouth', 'child', 'purse', 'book', 'floor', 'hamburger', 'top', 'glass', 'bread', 'door', 'person', 'bottle', 'jar'] 2022-03-17 06:23:21,403.403 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'book', 'woman', 'cup', 'hair', 'girl', 'mouth', 'child', 'table', 'wall', 'arm', 'eye', 'chair', 'paper', 'window', 'watch', 'box', 'jean', 'shirt', 'picture', 'bag', 'toy', 'shoe', 'container', 'tray', 'lid', 'jar', 'bunny', 'napkin'] 2022-03-17 06:25:45,020.020 2829:trainer.py:487 do_train_dict(): eta: 6:29:36 iter: 53000 speed: 274.4 images/sec total_norm: 147.5545 (150.3079) loss: 139.4126 (139.8593) masked_loss: 1.4222 (1.4393) tag_loss: 138.0947 (138.4200) time: 1.4325 (1.8661) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.8610) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:25:45,381.381 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7575757503509521 2022-03-17 06:25:45,381.381 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.57278442382812 2022-03-17 06:25:45,381.381 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52162080610314 2022-03-17 06:26:11,898.898 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02310321480035782 2022-03-17 06:26:11,899.899 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:26:11,899.899 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'gi', '##raf', '##fe', '[MASK]', '[MASK]', 'on', 'a', 'green', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:26:11,914.914 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'sky', '[UNK]', 'leg', 'log', 'neck', 'shadow', 'bush', 'zoo', 'head', 'ground', 'trunk', 'enclosure', 'rock', 'tail', 'pole', 'fence', 'field', 'group', 'branch', 'animal', 'stump', 'wood', 'green', 'area', 'dirt', 'park', 'wall', 'post', 'grassy', 'lush', 'next', 'herd', 'flower', 'water', 'hair', 'mane', 'zebra', 'other', 'large', 'sunny', 'person', 'plant', 'couple', 'hay', 'horn', 'basket', 'open', 'spot'] 2022-03-17 06:26:27,799.799 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'group', 'field', 'ground', 'rock', 'green', 'neck', 'tree', 'wood', 'sky', 'leg', 'shadow', 'grass', 'tail', 'bush', 'trunk', 'log', 'zoo', 'enclosure', 'stump'] 2022-03-17 06:28:51,661.661 2829:trainer.py:487 do_train_dict(): eta: 6:26:44 iter: 53100 speed: 274.3 images/sec total_norm: 149.5856 (152.1965) loss: 135.7643 (138.6308) masked_loss: 1.4620 (1.4428) tag_loss: 134.8812 (137.1880) time: 1.4317 (1.8664) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4268 (1.8612) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:28:52,022.022 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 06:28:52,023.023 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 159.529052734375 2022-03-17 06:28:52,023.023 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51662395592022 2022-03-17 06:29:18,827.827 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023088660091161728 2022-03-17 06:29:18,827.827 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:29:18,828.828 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', '[MASK]', 'teddy', 'bears', 'are', 'sitting', 'together', 'on', 'the', 'red', 'table', '##cloth', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:29:18,843.843 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['foot', 'bear', 'teddy', 'head', 'nose', 'arm', 'eye', 'ear', 'paw', 'leg', 'stuffed', 'table', 'face', '[UNK]', 'towel', 'shirt', 'box', 'floor', 'wall', 'bench', 'toy', 'book', 'ground', 'mat', 'chair', 'toe', 'animal', 'doll', 'ball', 'cloth', 'reflection', 'bow', 'paper', 'ribbon', 'hand', 'bag', 'stripe', 'pillow', 'sock', 'hat', 'napkin', 'tree', 'tie', 'plant', 'container', 'shoe', 'hair', 'cushion', 'white', 'man'] 2022-03-17 06:29:34,882.882 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'red', 'rock', 'seat', 'arm', 'eye', 'neck', 'foot', 'window', 'tree', 'shirt', 'leg', 'clothes', 'nose', 'ear', 'bear', 'grass', 'bush', 'bench', 'toy', 'pillow', 'towel', 'ribbon', 'teddy', 'scarf', 'paw'] 2022-03-17 06:31:58,707.707 2829:trainer.py:487 do_train_dict(): eta: 6:23:53 iter: 53200 speed: 273.7 images/sec total_norm: 149.3245 (152.8166) loss: 141.7967 (143.0419) masked_loss: 1.3289 (1.3933) tag_loss: 140.7359 (141.6486) time: 1.4337 (1.8704) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4284 (1.8651) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:31:59,067.067 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 06:31:59,068.068 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 162.45693969726562 2022-03-17 06:31:59,068.068 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51993309579244 2022-03-17 06:32:25,947.947 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023116614669561386 2022-03-17 06:32:25,948.948 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:32:25,948.948 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'black', '[MASK]', '[MASK]', 'outside', 'a', 'wooden', 'door', 'on', 'bricks', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:32:25,963.963 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wheel', 'tire', 'motorcycle', 'wall', 'seat', 'brick', 'engine', 'ground', '[UNK]', 'building', 'floor', 'bike', 'tank', 'tile', 'fender', 'sidewalk', 'light', 'pipe', 'gas', 'door', 'logo', 'exhaust', 'spoke', 'handle', 'plate', 'mirror', 'window', 'plant', 'next', 'sign', 'chain', 'motor', 'front', 'black', 'license', 'weed', 'stand', 'red', 'curtain', 'fence', 'stone', 'pedal', 'pot', 'curb', 'tail', 'shadow', 'garage', 'rim', 'side', 'stain'] 2022-03-17 06:32:41,877.877 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'building', 'door', 'light', 'ground', 'rock', 'wall', 'seat', 'stone', 'engine', 'window', 'gas', 'wooden', 'tank', 'plate', 'wheel', 'mirror', 'brick', 'tail', 'pole', 'pipe', 'motorcycle', 'sidewalk', 'tire', 'mat', 'fender'] 2022-03-17 06:35:05,450.450 2829:trainer.py:487 do_train_dict(): eta: 6:21:02 iter: 53300 speed: 274.2 images/sec total_norm: 148.2015 (151.5175) loss: 138.6734 (139.3751) masked_loss: 1.4129 (1.4551) tag_loss: 137.6418 (137.9200) time: 1.4328 (1.8674) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4276 (1.8624) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:35:05,812.812 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6111111044883728 2022-03-17 06:35:05,813.813 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 164.28720092773438 2022-03-17 06:35:05,813.813 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51550155096733 2022-03-17 06:35:32,375.375 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023114699870347977 2022-03-17 06:35:32,376.376 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:35:32,376.376 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'on', '[MASK]', 'pair', 'of', 'skies', 'during', '[MASK]', 'competition', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:35:32,391.391 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['boot', 'ski', 'snow', 'pole', '[UNK]', 'glove', 'leg', 'hair', 'ground', 'woman', 'hand', 'head', 'number', 'shirt', 'skier', 'vest', 'face', 'helmet', 'person', 'foot', 'arm', 'suit', 'logo', 'snowy', 'slope', 'girl', 'hat', 'hill', 'jacket', 'tree', 'outfit', 'red', 'sleeve', 'sky', 'letter', 'shin', 'top', 'line', 'man', 'stick', 'ponytail', 'flag', 'sign', 'track', 'skiing', 'pants', 'competitive', 'course', 'guard', 'cap'] 2022-03-17 06:35:48,360.360 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'number', 'face', 'woman', 'ground', 'hair', 'competition', 'foot', 'guard', 'shirt', 'pair', 'leg', 'snow', 'pole', 'jacket', 'logo', 'ski', 'boot', 'helmet', 'shin', 'glove', 'vest', 'skier'] 2022-03-17 06:38:12,447.447 2829:trainer.py:487 do_train_dict(): eta: 6:18:11 iter: 53400 speed: 273.8 images/sec total_norm: 149.7664 (153.5950) loss: 138.5418 (139.5406) masked_loss: 1.3497 (1.3930) tag_loss: 137.0514 (138.1477) time: 1.4332 (1.8700) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4279 (1.8649) save_time: 8.8421 (15.3432) lr: 0.000020 max mem: 26307 2022-03-17 06:38:12,808.808 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44117647409439087 2022-03-17 06:38:12,808.808 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.84719848632812 2022-03-17 06:38:12,808.808 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51551540053893 2022-03-17 06:38:39,758.758 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023107659071683884 2022-03-17 06:38:39,758.758 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:38:39,758.758 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'a', '[MASK]', 'bat', '[MASK]', 'children', 'and', 'other', 'adults', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:38:39,774.774 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'jean', 'head', 'boy', 'shirt', 'wall', '[UNK]', 'vest', 'cabinet', 'kitchen', 'jacket', 'bat', 'hand', 'couch', 'table', 'man', 'woman', 'light', 'child', 'chair', 'ear', 'hat', 'girl', 'lamp', 'person', 'handle', 'balloon', 'baseball', 'dress', 'ceiling', 'cup', 'hood', 'door', 'vase', 'shelf', 'face', 'sleeve', 'sink', 'rack', 'towel', 'pillow', 'cloth', 'flower', 'kid', 'bag', 'paper', 'can', 'floor', 'microwave', 'plant'] 2022-03-17 06:38:55,650.650 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'other', 'can', 'head', 'man', 'hand', 'door', 'light', 'woman', 'hair', 'girl', 'person', 'table', 'wall', 'boy', 'paper', 'jean', 'shirt', 'kitchen', 'dress', 'handle', 'cabinet', 'hat', 'couch', 'jacket', 'bat', 'hood', 'towel', 'lamp', 'rack', 'fixture', 'vest', 'foam', 'leash'] 2022-03-17 06:41:19,659.659 2829:trainer.py:487 do_train_dict(): eta: 6:15:20 iter: 53500 speed: 273.5 images/sec total_norm: 147.9064 (149.4840) loss: 135.6014 (137.6662) masked_loss: 1.3493 (1.4345) tag_loss: 133.8281 (136.2318) time: 1.4336 (1.8721) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4283 (1.8669) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:41:20,022.022 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.375 2022-03-17 06:41:20,022.022 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.85379028320312 2022-03-17 06:41:20,022.022 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51670611794316 2022-03-17 06:41:47,055.055 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023115666583180428 2022-03-17 06:41:47,055.055 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:41:47,055.055 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sidewalk', 'outside', 'muscles', '[MASK]', 'winery', 'with', 'tables', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:41:47,071.071 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'building', 'sign', 'sidewalk', 'wall', 'sky', 'pole', 'car', 'table', 'chair', 'light', 'tree', 'street', 'shadow', '[UNK]', 'door', 'brick', 'curb', 'glass', 'reflection', 'store', 'city', 'restaurant', 'ground', 'bench', 'mat', 'bike', 'post', 'night', 'person', 'line', 'basket', 'bicycle', 'letter', 'base', 'empty', 'side', 'front', 'dirt', 'can', 'tile', 'meter', 'paper', 'parking', 'road', 'pipe', 'trash', 'large', 'patio', 'man'] 2022-03-17 06:42:03,008.008 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['name', 'building', 'door', 'light', 'car', 'wall', 'chair', 'window', 'tree', 'sign', 'sky', 'pole', 'sidewalk', 'winery'] 03-17 06:43:20.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 06:43:20.405 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 06:43:21.369 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 06:44:26,711.711 2829:trainer.py:487 do_train_dict(): eta: 6:12:28 iter: 53600 speed: 273.7 images/sec total_norm: 148.2527 (151.9876) loss: 139.1617 (140.3812) masked_loss: 1.3356 (1.4347) tag_loss: 137.4768 (138.9464) time: 1.4322 (1.8705) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4269 (1.8654) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:44:27,070.070 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 06:44:27,071.071 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.1796875 2022-03-17 06:44:27,071.071 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51425576520809 2022-03-17 06:44:54,166.166 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023094050586223602 2022-03-17 06:44:54,166.166 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:44:54,166.166 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', 'bohemia', 'is', 'attending', 'a', '93', 'game', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:44:54,182.182 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'stadium', 'shirt', 'stand', '[UNK]', 'man', 'line', 'player', 'wall', 'hat', 'tennis', 'court', 'game', 'sky', 'net', 'sign', 'shoe', 'head', 'field', 'cap', 'grass', 'chair', 'uniform', 'short', 'umpire', 'crowd', 'spectator', 'advertisement', 'fence', 'hair', 'match', 'woman', 'bag', 'logo', 'camera', 'pole', 'leg', 'shadow', 'building', 'arm', 'ball', 'cooler', 'stair', 'banner', 'catcher', 'outfit', 'baseball', 'roof', 'billboard', 'screen'] 2022-03-17 06:45:10,179.179 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'game', 'line', 'large', 'player', 'court', 'short', 'field', 'hair', 'person', 'wall', 'stand', 'chair', 'stadium', 'sky', 'shirt', 'audience', 'roof', 'tennis', 'net', 'hat', 'shoe'] 2022-03-17 06:47:33,852.852 2829:trainer.py:487 do_train_dict(): eta: 6:09:37 iter: 53700 speed: 273.6 images/sec total_norm: 148.8559 (150.8806) loss: 138.2961 (137.3568) masked_loss: 1.4703 (1.4358) tag_loss: 137.1160 (135.9210) time: 1.4334 (1.8714) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.8662) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:47:34,215.215 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-17 06:47:34,216.216 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.52116394042969 2022-03-17 06:47:34,216.216 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52387236839776 2022-03-17 06:48:01,303.303 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023098932579159737 2022-03-17 06:48:01,304.304 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:48:01,304.304 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'this', 'is', 'someone', '##s', 'office', 'inside', '[MASK]', 'their', 'home', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:48:01,320.320 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'wall', 'shelf', 'computer', 'table', 'desk', 'monitor', 'paper', 'book', 'glass', 'mouse', '[UNK]', 'keyboard', 'laptop', 'logo', 'room', 'cord', 'box', 'speaker', 'picture', 'lamp', 'pen', 'painting', 'bottle', 'wire', 'office', 'phone', 'pad', 'screen', 'cd', 'sign', 'door', 'cup', 'can', 'water', 'apple', 'coffee', 'chair', 'light', 'frame', 'stand', 'glasses', 'printer', 'floor', 'reflection', 'bowl', 'coaster', 'handle', 'curtain', 'cabinet'] 2022-03-17 06:48:17,312.312 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'home', 'room', 'book', 'office', 'table', 'wall', 'glass', 'paper', 'computer', 'window', 'box', 'picture', 'painting', 'desk', 'cabinet', 'speaker', 'liquid', 'pen', 'wire', 'mouse', 'monitor', 'logo', 'keyboard', 'lamp', 'shelf', 'cord', 'pad', 'laptop', 'printer', 'coaster', 'vase'] 2022-03-17 06:50:41,047.047 2829:trainer.py:487 do_train_dict(): eta: 6:06:45 iter: 53800 speed: 273.5 images/sec total_norm: 148.5597 (150.7829) loss: 137.6516 (138.9184) masked_loss: 1.4509 (1.4525) tag_loss: 136.3108 (137.4659) time: 1.4347 (1.8719) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4294 (1.8668) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:50:41,408.408 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 06:50:41,409.409 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.27928161621094 2022-03-17 06:50:41,409.409 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51999815054417 2022-03-17 06:51:08,269.269 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023086512461304665 2022-03-17 06:51:08,269.269 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:51:08,269.269 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'plate', 'with', 'the', 'remains', 'of', 'cake', 'and', 'ice', 'cream', 'on', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:51:08,285.285 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'plate', 'cake', '[UNK]', 'bowl', 'bread', 'food', 'shadow', 'cream', 'napkin', 'dessert', 'ice', 'chocolate', 'fork', 'meat', 'white', 'crust', 'spoon', 'handle', 'sauce', 'piece', 'paper', 'sandwich', 'glass', 'top', 'eaten', 'knife', 'half', 'slice', 'pastry', 'cup', 'coffee', 'water', 'whipped', 'dish', 'topping', 'desert', 'container', 'design', 'cloth', 'sugar', 'close', 'next', 'object', 'light', 'pie', 'cut', 'small', 'side', 'layer'] 2022-03-17 06:51:24,178.178 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'cup', 'table', 'food', 'ice', 'bowl', 'handle', 'plate', 'shadow', 'cream', 'bread', 'fork', 'cake', 'sauce'] 2022-03-17 06:53:48,199.199 2829:trainer.py:487 do_train_dict(): eta: 6:03:54 iter: 53900 speed: 273.6 images/sec total_norm: 148.3664 (150.3349) loss: 135.0100 (135.2947) masked_loss: 1.3557 (1.3908) tag_loss: 134.3097 (133.9039) time: 1.4326 (1.8716) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8664) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:53:48,561.561 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192 2022-03-17 06:53:48,561.561 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.25933837890625 2022-03-17 06:53:48,561.561 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.5328981116966 2022-03-17 06:54:15,532.532 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023074764758348465 2022-03-17 06:54:15,532.532 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:54:15,532.532 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'of', 'people', 'sand', '##ing', 'around', '[MASK]', 'kitchen', 'having', 'conversation', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:54:15,548.548 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', '[UNK]', 'kitchen', 'window', 'cabinet', 'woman', 'floor', 'man', 'hand', 'sweater', 'watch', 'jean', 'person', 'girl', 'door', 'apple', 'shelf', 'bowl', 'stove', 'head', 'basket', 'pot', 'bottle', 'wall', 'refrigerator', 'oven', 'bag', 'plate', 'food', 'glasses', 'fruit', 'arm', 'can', 'table', 'bracelet', 'shoe', 'drawer', 'microwave', 'box', 'light', 'picture', 'face', 'sink', 'chair', 'rack', 'towel', 'handle', 'lady', 'cup'] 2022-03-17 06:54:31,499.499 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'hand', 'door', 'light', 'woman', 'hair', 'girl', 'person', 'floor', 'wall', 'food', 'arm', 'lady', 'window', 'watch', 'box', 'shirt', 'kitchen', 'picture', 'dress', 'conversation', 'bag', 'bowl', 'cabinet', 'fan', 'ceiling', 'apple', 'glasses', 'pot', 'boot', 'basket', 'shelf', 'container', 'necklace', 'drawer', 'sweater', 'banana', 'oven', 'refrigerator', 'microwave', 'bracelet'] 2022-03-17 06:56:55,586.586 2829:trainer.py:487 do_train_dict(): eta: 6:01:02 iter: 54000 speed: 273.2 images/sec total_norm: 148.0509 (151.0156) loss: 138.7195 (139.3287) masked_loss: 1.3462 (1.3543) tag_loss: 137.3788 (137.9744) time: 1.4326 (1.8739) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8687) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 06:56:55,946.946 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 06:56:55,947.947 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.38815307617188 2022-03-17 06:56:55,947.947 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52628975831206 2022-03-17 06:57:22,803.803 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023100974038243294 2022-03-17 06:57:22,803.803 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 06:57:22,803.803 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '[MASK]', 'hanging', 'over', 'a', 'city', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 06:57:22,819.819 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'pole', 'light', 'building', 'tree', 'street', 'car', 'road', 'sign', 'window', 'sidewalk', 'traffic', 'line', 'curb', '[UNK]', 'city', 'person', 'arrow', 'intersection', 'shadow', 'roof', 'post', 'fire', 'store', 'tire', 'corner', 'signal', 'bush', 'lamp', 'green', 'red', 'truck', 'suv', 'bus', 'man', 'flag', 'tail', 'van', 'house', 'box', 'median', 'busy', 'door', 'fence', 'clock', 'wall', 'chimney', 'jacket', 'cover', 'town'] 2022-03-17 06:57:38,710.710 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'line', 'building', 'road', 'street', 'light', 'woman', 'car', 'post', 'person', 'window', 'tree', 'box', 'store', 'sign', 'sky', 'jean', 'traffic', 'pole', 'jacket', 'globe', 'lamp', 'sidewalk', 'curb'] 2022-03-17 07:00:02,819.819 2829:trainer.py:487 do_train_dict(): eta: 5:58:11 iter: 54100 speed: 273.5 images/sec total_norm: 148.3124 (148.5386) loss: 141.6317 (142.7959) masked_loss: 1.4132 (1.4406) tag_loss: 140.1342 (141.3553) time: 1.4324 (1.8723) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4270 (1.8670) save_time: 8.8421 (15.3432) lr: 0.000019 max mem: 26307 2022-03-17 07:00:03,184.184 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.4285714328289032 2022-03-17 07:00:03,184.184 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 187.35125732421875 2022-03-17 07:00:03,185.185 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.51709695935689 2022-03-17 07:00:30,438.438 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023073064163327217 2022-03-17 07:00:30,438.438 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:00:30,439.439 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'navy', '[MASK]', 'is', 'moore', '##d', 'in', 'front', 'of', 'woods', '[MASK]', 'a', 'clock', 'tower', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:00:30,454.454 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'water', 'boat', 'sky', 'building', 'bridge', 'tower', 'harbor', 'background', '[UNK]', 'tire', 'person', 'dock', 'flag', 'forest', 'pole', 'window', 'reflection', 'large', 'structure', 'roof', 'sign', 'car', 'stripe', 'mast', 'life', 'river', 'bird', 'palm', 'box', 'man', 'number', 'crane', 'shore', 'light', 'house', 'next', 'bottom', 'body', 'cabin', 'small', 'writing', 'pier', 'wall', 'post', 'wheel', 'other', 'ball', 'line', 'lamp'] 2022-03-17 07:00:46,495.495 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'number', 'water', 'building', 'front', 'bridge', 'navy', 'tree', 'tower', 'letter', 'sign', 'sky', 'bottom', 'boat', 'background', 'roof', 'clock', 'flag', 'vessel', 'pole', 'cabin', 'dock', 'tire', 'stair'] 2022-03-17 07:03:10,141.141 2829:trainer.py:487 do_train_dict(): eta: 5:55:19 iter: 54200 speed: 273.3 images/sec total_norm: 149.4554 (151.9599) loss: 133.5153 (135.4099) masked_loss: 1.4301 (1.4873) tag_loss: 132.0351 (133.9226) time: 1.4322 (1.8732) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4272 (1.8681) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:03:10,500.500 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 07:03:10,500.500 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.13644409179688 2022-03-17 07:03:10,500.500 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52164006101492 2022-03-17 07:03:37,476.476 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023063212633132935 2022-03-17 07:03:37,476.476 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:03:37,477.477 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'couple', 'of', 'people', 'skiing', '[MASK]', 'a', 'snowy', 'slope', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:03:37,492.492 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['ski', 'pole', '[UNK]', 'tree', 'person', 'snow', 'track', 'ground', 'hair', 'jacket', 'leg', 'skier', 'woman', 'backpack', 'branch', 'country', 'cross', 'boot', 'hand', 'hat', 'foot', 'trail', 'hill', 'head', 'snowy', 'sign', 'slope', 'boy', 'arm', 'girl', 'poles', 'coat', 'trunk', 'shirt', 'glove', 'path', 'child', 'sky', 'pine', 'skiing', 'man', 'bush', 'couple', 'side', 'line', 'wood', 'hood', 'way', 'wooded', 'shoe'] 2022-03-17 07:03:53,376.376 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'ground', 'hair', 'track', 'person', 'child', 'couple', 'tree', 'sign', 'trail', 'snow', 'coat', 'pole', 'jacket', 'ski', 'slope', 'poles', 'backpack', 'snowy', 'skier'] 2022-03-17 07:06:17,399.399 2829:trainer.py:487 do_train_dict(): eta: 5:52:28 iter: 54300 speed: 273.4 images/sec total_norm: 148.4166 (151.8802) loss: 136.9527 (139.2305) masked_loss: 1.4078 (1.4522) tag_loss: 135.5449 (137.7783) time: 1.4327 (1.8726) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8671) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:06:17,759.759 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 07:06:17,759.759 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.07948303222656 2022-03-17 07:06:17,760.760 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.5228221206104 2022-03-17 07:06:44,990.990 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023059379309415817 2022-03-17 07:06:44,991.991 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:06:44,991.991 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'vintage', 'banker', 'suit', 'is', 'leaning', 'against', 'a', 'vehicle', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:06:45,007.007 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'button', 'man', 'sleeve', 'vest', 'nose', 'tie', 'head', 'arm', 'face', 'belt', '[UNK]', 'collar', 'mouth', 'hair', 'ear', 'eye', 'suit', 'buckle', 'jacket', 'pocket', 'wall', 'hand', 'window', 'glasses', 'watch', 'phone', 'hat', 'name', 'sunglasses', 'neck', 'knot', 'tag', 'person', 'wrist', 'building', 'background', 'car', 'coat', 'beard', 'sky', 'shadow', 'cell', 'logo', 'black', 'tree', 'curtain', 'cuff', 'grass', 'chin'] 2022-03-17 07:07:00,906.906 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'hand', 'face', 'hair', 'wall', 'arm', 'neck', 'window', 'shirt', 'label', 'picture', 'vehicle', 'nose', 'suit', 'frame', 'handle', 'tie', 'belt', 'blind', 'tag', 'button', 'jacket', 'sleeve', 'banker', 'vintage', 'sunglasses', 'vest', 'mustache'] 2022-03-17 07:09:24,628.628 2829:trainer.py:487 do_train_dict(): eta: 5:49:36 iter: 54400 speed: 273.5 images/sec total_norm: 148.6420 (150.2953) loss: 137.3252 (139.1413) masked_loss: 1.4169 (1.4307) tag_loss: 135.7573 (137.7105) time: 1.4322 (1.8723) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8671) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:09:24,988.988 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 07:09:24,988.988 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.4403076171875 2022-03-17 07:09:24,988.988 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52352770184159 2022-03-17 07:09:52,068.068 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02305726893246174 2022-03-17 07:09:52,069.069 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:09:52,069.069 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'glass', 'of', '[MASK]', 'and', 'white', 'wine', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:09:52,084.084 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'glass', 'stem', 'wine', 'base', '[UNK]', 'light', 'liquid', 'red', 'bottle', 'wall', 'handle', 'object', 'bowl', 'rim', 'background', 'close', 'next', 'shadow', 'reflection', 'white', 'paper', 'top', 'cup', 'plate', 'counter', 'knife', 'wooden', 'water', 'empty', 'ring', 'bubble', 'flower', 'person', 'small', 'bunch', 'floor', 'surface', 'label', 'full', 'fork', 'food', 'glasses', 'knot', 'napkin', 'view', 'sit', 'blade', 'image', 'other'] 2022-03-17 07:10:07,953.953 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['white', 'red', 'cup', 'table', 'glass', 'wine', 'bowl', 'liquid', 'stem'] 2022-03-17 07:12:32,327.327 2829:trainer.py:487 do_train_dict(): eta: 5:46:44 iter: 54500 speed: 272.8 images/sec total_norm: 148.6849 (151.0165) loss: 137.1681 (138.1163) masked_loss: 1.4510 (1.4560) tag_loss: 135.2661 (136.6602) time: 1.4332 (1.8770) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8718) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:12:32,688.688 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 07:12:32,688.688 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 163.749267578125 2022-03-17 07:12:32,688.688 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.52617822025285 2022-03-17 07:13:00,024.024 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023123163729906082 2022-03-17 07:13:00,024.024 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:13:00,025.025 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'holding', 'up', 'holding', 'an', 'object', 'inside', 'of', 'a', 'plastic', '[MASK]', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:13:00,040.040 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hand', 'shirt', 'head', 'beard', 'nose', 'sign', 'necklace', 'hair', 'wall', 'door', 'building', 'face', 'eye', 'arm', 'mustache', 'neck', 'fence', 'phone', 'chain', 'box', 'window', 'sky', 'letter', 'floor', 'cell', 'ground', 'ear', '[UNK]', 'jean', 'mouth', 'bracelet', 'chair', 'front', 'handle', 'wrist', 'tree', 'facial', 'house', 'post', 'camera', 'collar', 'watch', 'next', 'sleeve', 'picture', 'roof', 'frame', 'table', 'pole'] 2022-03-17 07:13:16,055.055 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'head', 'man', 'hand', 'face', 'building', 'door', 'short', 'inside', 'case', 'ground', 'hair', 'mouth', 'floor', 'wall', 'arm', 'phone', 'eye', 'chair', 'neck', 'box', 'letter', 'sign', 'sky', 'shirt', 'nose', 'object', 'plastic', 'fence', 'beard', 'necklace', 'stool', 'mustache'] 03-17 07:13:21.469 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 07:13:21.469 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 07:13:22.537 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 07:15:39,560.560 2829:trainer.py:487 do_train_dict(): eta: 5:43:52 iter: 54600 speed: 273.5 images/sec total_norm: 150.4979 (153.6734) loss: 138.9823 (139.6637) masked_loss: 1.3454 (1.3863) tag_loss: 136.8902 (138.2775) time: 1.4325 (1.8724) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4274 (1.8671) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:15:39,921.921 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 07:15:39,921.921 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.3876953125 2022-03-17 07:15:39,922.922 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.53041911430289 2022-03-17 07:16:07,467.467 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02310135029256344 2022-03-17 07:16:07,468.468 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 07:16:07,468.468 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'passenger', 'train', '[MASK]', 'on', 'train', 'tracks', '##ccus', 'overhead', 'wires', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 07:16:07,484.484 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['train', 'window', 'track', 'door', 'sky', 'number', 'wire', 'crane', 'wheel', 'building', 'flag', 'light', 'top', 'roof', 'car', 'structure', 'tower', '[UNK]', 'vent', 'sign', 'black', 'logo', 'clock', 'pole', 'silver', 'line', 'cable', 'metal', 'railroad', 'church', 'platform', 'gravel', 'bottom', 'bridge', 'letter', 'old', 'front', 'chimney', 'power', 'person', 'passenger', 'rail', 'antenna', 'man', 'station', 'wall', 'wood', 'step', 'board', 'blue'] 2022-03-17 07:16:23,469.469 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'building', 'top', 'door', 'track', 'window', 'train', 'letter', 'sky', 'bottom', 'passenger', 'clock', 'flag', 'wheel', 'wire', 'logo', 'overhead', 'ladder', 'crane'] 2022-03-17 07:18:47,167.167 2829:trainer.py:487 do_train_dict(): eta: 5:41:00 iter: 54700 speed: 272.9 images/sec total_norm: 150.6016 (153.3920) loss: 139.6840 (141.2588) masked_loss: 1.4077 (1.4399) tag_loss: 138.3964 (139.8189) time: 1.4327 (1.8761) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4273 (1.8708) save_time: 8.8421 (15.3432) lr: 0.000018 max mem: 26307 2022-03-17 07:18:47,528.528 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6857143044471741 2022-03-17 07:18:47,529.529 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.51622009277344 2022-03-17 07:18:47,529.529 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.53192005714361
2022-03-17 07:19:15,056.056 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02309594489634037
2022-03-17 07:19:15,057.057 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:19:15,057.057 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'bears', 'climbing', '[MASK]', 'rocks', 'in', '[MASK]', 'snow', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:19:15,073.073 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'ear', 'nose', 'head', 'ground', 'eye', 'rock', 'leg', 'mouth', 'paw', 'snow', 'brown', 'back', 'face', 'log', 'snout', 'fur', 'wood', 'claw', 'shadow', 'tree', 'post', 'water', 'polar', 'large', 'wall', 'foot', 'grass', 'enclosure', 'pole', 'trunk', 'zoo', 'fence', 'tongue', 'reflection', 'tail', 'neck', 'dirt', 'stone', 'other', 'couple', 'big', 'pen', '[UNK]', 'stick', 'knot', 'next', 'boulder', 'animal', 'furry']
2022-03-17 07:19:31,021.021 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'ground', 'rock', 'mouth', 'eye', 'nose', 'ear', 'bear', 'snow', 'log', 'moss', 'paw']
2022-03-17 07:21:54,770.770 2829:trainer.py:487 do_train_dict(): eta: 5:38:08 iter: 54800 speed: 272.9 images/sec total_norm: 148.0623 (151.0993) loss: 140.7924 (141.4211) masked_loss: 1.4195 (1.4604) tag_loss: 139.1667 (139.9607) time: 1.4329 (1.8760) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.8708) save_time: 8.8421 (15.3432) lr: 0.000017 max mem: 26307
2022-03-17 07:21:55,132.132 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128
2022-03-17 07:21:55,133.133 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.77622985839844
2022-03-17 07:21:55,133.133 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.54276910027954
2022-03-17 07:22:22,902.902 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023112192749977112
2022-03-17 07:22:22,903.903 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:22:22,903.903 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'sink', 'mounted', 'to', 'the', 'side', 'of', 'a', 'white', 'wall', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:22:22,919.919 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'sink', 'floor', 'hole', 'base', 'paint', 'bathroom', 'toilet', '[UNK]', 'leaf', 'bowl', 'shelf', 'pipe', 'ceiling', 'painting', 'ground', 'drain', 'plant', 'room', 'dirty', 'basin', 'tile', 'seat', 'window', 'tank', 'dirt', 'stand', 'trash', 'graffiti', 'broken', 'old', 'handle', 'building', 'white', 'line', 'flower', 'leg', 'tree', 'vase', 'lid', 'light', 'pedestal', 'small', 'table', 'tub', 'water', 'sculpture', 'soap', 'stain', 'reflection']
2022-03-17 07:22:38,823.823 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'side', 'white', 'floor', 'wall', 'base', 'bowl', 'hole', 'paint', 'sink', 'shelf', 'toilet']
2022-03-17 07:25:02,468.468 2829:trainer.py:487 do_train_dict(): eta: 5:35:16 iter: 54900 speed: 272.8 images/sec total_norm: 149.8817 (151.8378) loss: 137.2733 (137.8993) masked_loss: 1.3250 (1.3788) tag_loss: 135.8262 (136.5206) time: 1.4322 (1.8770) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.8718) save_time: 8.8421 (15.3432) lr: 0.000017 max mem: 26307
2022-03-17 07:25:02,828.828 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5454545617103577
2022-03-17 07:25:02,829.829 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.87173461914062
2022-03-17 07:25:02,829.829 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.5487529893355
2022-03-17 07:25:30,442.442 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023135090246796608
2022-03-17 07:25:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:25:30,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'surfing', 'in', 'a', '[MASK]', 'large', 'body', 'of', 'water', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:25:30,459.459 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'sky', 'man', 'ocean', 'person', 'kite', '[UNK]', 'boat', 'wave', 'island', 'wake', 'sail', 'rock', 'board', 'hill', 'land', 'body', 'horizon', 'para', 'surfer', 'mountain', 'tree', 'wind', 'large', 'stripe', 'surf', 'surfing', 'day', 'blue', 'distance', 'building', 'head', 'cloud', 'ski', 'paddle', 'sunny', 'group', 'short', 'middle', 'string', 'line', 'bird', 'top', 'object', 'beach', 'leg', 'clear', 'sailing', 'shirt', 'bush']
2022-03-17 07:25:46,394.394 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'water', 'body', 'large', 'island', 'rock', 'person', 'sky', 'boat', 'ocean', 'wave', 'wake', 'horizon', 'kite']
2022-03-17 07:28:10,049.049 2829:trainer.py:487 do_train_dict(): eta: 5:32:24 iter: 55000 speed: 273.0 images/sec total_norm: 149.1632 (153.8445) loss: 140.8947 (140.2621) masked_loss: 1.3745 (1.4348) tag_loss: 139.3336 (138.8274) time: 1.4329 (1.8758) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4277 (1.8706) save_time: 8.8421 (15.3432) lr: 0.000017 max mem: 26307
2022-03-17 07:28:10,051.051 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0055000.pt
2022-03-17 07:28:19,121.121 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.800000011920929
2022-03-17 07:28:19,122.122 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.20748901367188
2022-03-17 07:28:19,122.122 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55034812608778
2022-03-17 07:28:46,745.745 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02318420074880123
2022-03-17 07:28:46,745.745 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:28:46,746.746 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'toilet', 'and', 'a', 'sink', 'nazi', 'a', 'small', 'room', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 07:28:46,761.761 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'floor', 'toilet', 'flower', 'bowl', 'seat', 'cord', 'base', 'design', 'basket', 'plate', 'wire', 'star', 'table', 'lid', '[UNK]', 'shoe', 'bathroom', 'pipe', 'vase', 'paper', 'chair', 'handle', 'hole', 'water', 'bucket', 'ground', 'hose', 'black', 'fireplace', 'decoration', 'bag', 'room', 'sink', 'rope', 'object', 'white', 'rim', 'scissors', 'brush', 'container', 'floral', 'holder', 'mirror', 'person', 'line', 'pot', 'boot', 'strap', 'cup']
2022-03-17 07:29:02,691.691 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['small', 'room', 'floor', 'star', 'table', 'wall', 'seat', 'base', 'bag', 'bowl', 'plate', 'bathroom', 'flower', 'wire', 'sink', 'pipe', 'boot', 'shoe', 'cord', 'toilet', 'bucket']
2022-03-17 07:31:25,594.594 2829:trainer.py:487 do_train_dict(): eta: 5:29:34 iter: 55100 speed: 261.8 images/sec total_norm: 149.2752 (152.3086) loss: 139.8544 (139.5519) masked_loss: 1.4109 (1.4445) tag_loss: 138.5486 (138.1074) time: 1.4328 (1.9554) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4277 (1.8633) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:31:25,958.958 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.78125
2022-03-17 07:31:25,958.958 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.60842895507812
2022-03-17 07:31:25,958.958 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.54972153124602
2022-03-17 07:31:53,708.708 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02319576032459736
2022-03-17 07:31:53,708.708 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:31:53,708.708 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'wool', '##ly', 'sheep', 'are', 'in', 'a', 'friedman', 'field', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:31:53,724.724 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sheep', 'grass', 'fence', 'head', 'leg', 'field', 'post', 'ear', 'tree', 'pole', 'wool', 'face', '[UNK]', 'bush', 'green', 'eye', 'nose', 'tail', 'grazing', 'tag', 'building', 'person', 'trunk', 'dog', 'lamb', 'group', 'sky', 'animal', 'fur', 'grassy', 'background', 'herd', 'area', 'lush', 'mouth', 'pasture', 'large', 'standing', 'barn', 'body', 'hill', 'leaf', 'mane', 'white', 'next', 'wire', 'couple', 'coat', 'hay', 'wood']
2022-03-17 07:32:09,685.685 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'green', 'post', 'tree', 'leg', 'ear', 'grass', 'bush', 'fur', 'pole', 'trunk', 'sheep', 'fence', 'hay', 'wool', 'grazing']
2022-03-17 07:34:33,741.741 2829:trainer.py:487 do_train_dict(): eta: 5:26:42 iter: 55200 speed: 272.1 images/sec total_norm: 147.0503 (149.8581) loss: 136.7761 (138.6221) masked_loss: 1.3926 (1.4238) tag_loss: 135.2788 (137.1983) time: 1.4334 (1.8814) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8763) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:34:34,102.102 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.53125
2022-03-17 07:34:34,102.102 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.0072479248047
2022-03-17 07:34:34,102.102 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55092213166773
2022-03-17 07:35:01,761.761 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023187007755041122
2022-03-17 07:35:01,762.762 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:35:01,762.762 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'blue', 'room', 'with', 'an', 'open', 'window', 'and', '[MASK]', 'bed', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:35:01,778.778 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bed', 'window', 'wall', 'bear', 'teddy', 'pillow', '[UNK]', 'bedroom', 'sign', 'room', 'sheet', 'head', 'blanket', 'frame', 'animal', 'ceiling', 'camera', 'curtain', 'light', 'shelf', 'lamp', 'stuffed', 'ear', 'picture', 'flower', 'speaker', 'bow', 'nose', 'knob', 'arm', 'mirror', 'table', 'leaf', 'blind', 'foot', 'clock', 'post', 'fan', 'box', 'white', 'tree', 'floor', 'blue', 'paper', 'rail', 'shirt', 'outlet', 'small', 'toy', 'large']
2022-03-17 07:35:17,732.732 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'open', 'blue', 'bed', 'wall', 'window', 'sign', 'bear', 'camera', 'ceiling', 'flower', 'sheet', 'pillow', 'curtain', 'shelf', 'teddy', 'paw']
2022-03-17 07:37:41,506.506 2829:trainer.py:487 do_train_dict(): eta: 5:23:50 iter: 55300 speed: 272.7 images/sec total_norm: 150.5369 (151.4214) loss: 140.9055 (141.8900) masked_loss: 1.4811 (1.4752) tag_loss: 139.4649 (140.4148) time: 1.4327 (1.8777) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4277 (1.8726) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:37:41,867.867 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543
2022-03-17 07:37:41,867.867 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.02864074707031
2022-03-17 07:37:41,867.867 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55853754174409
2022-03-17 07:38:09,565.565 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02322113700211048
2022-03-17 07:38:09,565.565 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:38:09,565.565 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'tennis', 'player', 'stands', 'ready', 'to', 'receive', 'the', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:38:09,581.581 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', '[UNK]', 'tennis', 'court', 'hand', 'fence', 'woman', 'leg', 'shoe', 'hair', 'ground', 'handle', 'shirt', 'ball', 'head', 'tree', 'girl', 'skirt', 'bush', 'ponytail', 'bracelet', 'arm', 'pole', 'short', 'top', 'logo', 'face', 'player', 'wall', 'sock', 'person', 'young', 'net', 'hat', 'dress', 'grass', 'car', 'female', 'jacket', 'string', 'tank', 'wrist', 'stripe', 'lady', 'sign', 'trunk', 'necklace', 'mouth', 'roof', 'cap']
2022-03-17 07:38:25,556.556 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'line', 'player', 'woman', 'court', 'short', 'ground', 'hair', 'ready', 'tree', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'bush', 'pole', 'fence', 'shoe']
2022-03-17 07:40:49,691.691 2829:trainer.py:487 do_train_dict(): eta: 5:20:58 iter: 55400 speed: 272.1 images/sec total_norm: 148.6515 (150.2700) loss: 136.8090 (136.7787) masked_loss: 1.3725 (1.4105) tag_loss: 135.4121 (135.3681) time: 1.4333 (1.8818) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4280 (1.8762) save_time: 8.8421 (14.7395) lr: 0.000017 max mem: 26307
2022-03-17 07:40:50,053.053 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6486486196517944
2022-03-17 07:40:50,053.053 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.59735107421875
2022-03-17 07:40:50,054.054 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.55902457709784
2022-03-17 07:41:17,684.684 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02321443520486355
2022-03-17 07:41:17,684.684 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:41:17,684.684 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'people', 'standing', 'on', '[MASK]', 'beach', '[MASK]', 'to', 'an', 'orange', 'fence', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:41:17,700.700 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'net', '[UNK]', 'vest', 'person', 'jacket', 'hill', 'shirt', 'snow', 'ground', 'foot', 'woman', 'shadow', 'short', 'ski', 'rock', 'hand', 'leg', 'bush', 'glove', 'head', 'hat', 'pole', 'helmet', 'bag', 'hair', 'group', 'fence', 'girl', 'arm', 'tree', 'grass', 'cap', 'sand', 'bottom', 'coat', 'boot', 'top', 'shoe', 'suit', 'couple', 'beach', 'boy', 'board', 'scarf', 'child', 'rope', 'pipe', 'sky', 'stick']
2022-03-17 07:41:33,622.622 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'life', 'group', 'hand', 'next', 'woman', 'short', 'ground', 'rock', 'board', 'person', 'arm', 'hill', 'couple', 'foot', 'beach', 'shirt', 'leg', 'bag', 'snow', 'orange', 'shadow', 'net', 'bush', 'hat', 'pole', 'jacket', 'ski', 'fence', 'helmet', 'shoe', 'glove', 'vest']
03-17 07:43:22.637 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 07:43:22.637 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 07:43:23.700 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 07:43:57,697.697 2829:trainer.py:487 do_train_dict(): eta: 5:18:05 iter: 55500 speed: 272.3 images/sec total_norm: 149.0936 (151.6089) loss: 137.0255 (138.9479) masked_loss: 1.3884 (1.4175) tag_loss: 135.6696 (137.5304) time: 1.4344 (1.8801) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4292 (1.8749) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:43:58,057.057 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 07:43:58,058.058 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.90350341796875
2022-03-17 07:43:58,058.058 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.56633572612735
2022-03-17 07:44:25,772.772 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023209838196635246
2022-03-17 07:44:25,772.772 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:44:25,773.773 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'young', 'kids', 'are', 'playing', '[MASK]', '##is', '[MASK]', 'together', 'outside', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:44:25,788.788 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'hair', 'bush', '[UNK]', 'boy', 'tree', 'sidewalk', 'short', 'flower', 'ground', 'shadow', 'leg', 'jean', 'girl', 'arm', 'shoe', 'gravel', 'hand', 'head', 'roof', 'stick', 'foot', 'young', 'building', 'woman', 'park', 'sleeve', 'child', 'house', 'rock', 'branch', 'leaf', 'sky', 'design', 'little', 'face', 'flip', 'glasses', 'grass', 'car', 'person', 'plant', 'wheel', 'small', 'ball', 'backpack', 'bag', 'flop', 'ear', 'red']
2022-03-17 07:44:41,710.710 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'park', 'young', 'short', 'car', 'ground', 'hair', 'girl', 'arm', 'boy', 'plant', 'tree', 'jean', 'shirt', 'leg', 'shadow', 'bush', 'flower', 'leaf', 'shoe', 'gravel', 'sidewalk', 'stripe']
2022-03-17 07:47:05,536.536 2829:trainer.py:487 do_train_dict(): eta: 5:15:13 iter: 55600 speed: 272.6 images/sec total_norm: 148.3173 (150.0740) loss: 134.8123 (137.3642) masked_loss: 1.4788 (1.4751) tag_loss: 133.3455 (135.8891) time: 1.4321 (1.8784) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4268 (1.8732) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:47:05,896.896 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 07:47:05,897.897 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.99308013916016
2022-03-17 07:47:05,897.897 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.56851624329599
2022-03-17 07:47:33,805.805 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023229800164699554
2022-03-17 07:47:33,805.805 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:47:33,805.805 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'pile', 'of', 'fresh', 'fruits', 'and', 'vegetables', 'on', 'top', '[MASK]', 'a', 'counter', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:47:33,821.821 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['apple', 'stem', '[UNK]', 'banana', 'squash', 'fruit', 'table', 'container', 'pumpkin', 'box', 'vegetable', 'leaf', 'onion', 'orange', 'pear', 'label', 'egg', 'potato', 'bunch', 'plastic', 'floor', 'shadow', 'bag', 'mushroom', 'top', 'other', 'logo', 'different', 'spot', 'green', 'lid', 'ground', 'crate', 'basket', 'bin', 'mango', 'bananas', 'counter', 'writing', 'plant', 'bottle', 'hole', 'next', 'letter', 'food', 'wall', 'variety', 'end', 'fresh', 'paper']
2022-03-17 07:47:49,652.652 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'table', 'label', 'counter', 'fresh', 'bottle', 'fruit', 'apple', 'package', 'stem', 'pile', 'container', 'banana', 'vegetable', 'mushroom', 'squash', 'onion', 'pumpkin', 'pear']
2022-03-17 07:50:13,270.270 2829:trainer.py:487 do_train_dict(): eta: 5:12:21 iter: 55700 speed: 272.7 images/sec total_norm: 149.8661 (152.7697) loss: 138.1550 (139.3135) masked_loss: 1.3633 (1.3793) tag_loss: 136.4007 (137.9342) time: 1.4328 (1.8773) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.8721) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:50:13,629.629 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274
2022-03-17 07:50:13,629.629 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 135.34750366210938
2022-03-17 07:50:13,630.630 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.56948369207348
2022-03-17 07:50:41,584.584 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023229990154504776
2022-03-17 07:50:41,585.585 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:50:41,585.585 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'men', 'that', '##heard', 'playing', 'a', 'wii', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:50:41,601.601 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'wall', 'chair', 'glasses', 'hand', 'jean', 'ceiling', 'controller', 'table', 'couch', 'remote', 'head', 'face', 'game', 'room', 'arm', 'beard', 'door', 'strap', '[UNK]', 'light', 'hat', 'hair', 'short', 'lamp', 'ear', 'video', 'wii', 'boy', 'computer', 'doorway', 'monitor', 'cap', 'pillow', 'jersey', 'switch', 'stripe', 'cord', 'sofa', 'desk', 'speaker', 'living', 'can', 'fan', 'bottle', 'number', 'logo', 'television', 'person']
2022-03-17 07:50:57,526.526 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'game', 'face', 'room', 'short', 'hair', 'table', 'wall', 'arm', 'boy', 'chair', 'paper', 'plant', 'computer', 'jean', 'shirt', 'ear', 'frame', 'mirror', 'ceiling', 'couch', 'remote', 'doorway', 'glasses', 'monitor', 'blanket', 'beard', 'lamp', 'controller', 'strap']
2022-03-17 07:53:21,191.191 2829:trainer.py:487 do_train_dict(): eta: 5:09:28 iter: 55800 speed: 272.5 images/sec total_norm: 148.3135 (150.1460) loss: 136.5807 (137.9012) masked_loss: 1.4268 (1.4673) tag_loss: 135.1069 (136.4339) time: 1.4324 (1.8793) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4276 (1.8741) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:53:21,551.551 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 07:53:21,551.551 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 127.00611114501953
2022-03-17 07:53:21,551.551 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.58090278884806
2022-03-17 07:53:49,643.643 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02324078604578972
2022-03-17 07:53:49,643.643 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:53:49,643.643 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'people', 'are', 'over', 'by', 'the', 'cows', 'in', 'the', 'water', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:53:49,659.659 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cow', 'water', 'grass', 'animal', 'head', 'boy', 'short', 'bull', 'hill', 'man', 'horn', 'river', 'person', 'hair', 'rock', 'elephant', 'shirt', 'tree', 'ear', 'collar', 'bank', '[UNK]', 'nose', 'dirt', 'trunk', 'ground', 'bird', 'bush', 'field', 'sky', 'rope', 'stick', 'herd', 'cattle', 'tail', 'wall', 'group', 'mouth', 'house', 'dog', 'face', 'building', 'buffalo', 'body', 'child', 'shore', 'neck', 'moss', 'hat', 'ripple']
2022-03-17 07:54:05,640.640 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'water', 'river', 'short', 'field', 'ground', 'rock', 'hair', 'person', 'boy', 'hill', 'neck', 'shirt', 'animal', 'nose', 'grass', 'bull', 'horn', 'collar', 'elephant', 'cow']
2022-03-17 07:56:29,346.346 2829:trainer.py:487 do_train_dict(): eta: 5:06:36 iter: 55900 speed: 272.1 images/sec total_norm: 148.4833 (151.3380) loss: 138.5231 (139.6853) masked_loss: 1.3421 (1.3937) tag_loss: 137.0009 (138.2916) time: 1.4329 (1.8816) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4277 (1.8763) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:56:29,706.706 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5277777910232544
2022-03-17 07:56:29,706.706 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.6077117919922
2022-03-17 07:56:29,706.706 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.58596892356873
2022-03-17 07:56:57,672.672 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02323666401207447
2022-03-17 07:56:57,672.672 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 07:56:57,673.673 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'little', 'boy', 'is', 'wearing', 'ski', '##s', 'and', '[MASK]', 'large', '[MASK]', 'inside', 'a', 'house', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 07:56:57,688.688 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'hand', 'boy', 'sock', 'wall', 'floor', '[UNK]', 'boot', 'nose', 'face', 'strap', 'door', 'shadow', 'carpet', 'head', 'ski', 'leg', 'eye', 'rug', 'shoe', 'mouth', 'arm', 'child', 'mat', 'stripe', 'design', 'young', 'collar', 'sleeve', 'wheel', 'short', 'person', 'little', 'girl', 'pad', 'logo', 'word', 'board', 'jacket', 'chair', 'ground', 'knee', 'handle', 'ear', 'kid', 'pole', 'building', 'light', 'ball']
2022-03-17 07:57:13,738.738 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'hand', 'little', 'face', 'large', 'door', 'short', 'mouth', 'floor', 'child', 'wall', 'arm', 'boy', 'eye', 'shirt', 'nose', 'shadow', 'ski', 'boot', 'sleeve', 'helmet', 'mat', 'strap', 'stripe', 'sock']
2022-03-17 07:59:37,470.470 2829:trainer.py:487 do_train_dict(): eta: 5:03:43 iter: 56000 speed: 272.2 images/sec total_norm: 150.1007 (155.2089) loss: 137.1774 (137.8814) masked_loss: 1.4450 (1.4492) tag_loss: 135.9774 (136.4323) time: 1.4324 (1.8812) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4273 (1.8761) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 07:59:37,830.830 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452
2022-03-17 07:59:37,831.831 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.09059143066406
2022-03-17 07:59:37,831.831 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59260741997105
2022-03-17 08:00:05,924.924 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023231040686368942
2022-03-17 08:00:05,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:00:05,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'close', '-', 'up', 'of', 'heads', 'of', 'light', 'green', 'bro', '##cco', '##li', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:00:05,940.940 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'stem', 'hole', 'cloth', 'background', 'vegetable', 'head', 'light', 'leaf', 'green', 'water', 'food', 'close', 'flower', 'object', 'piece', 'bud', 'plate', 'white', 'table', 'seed', 'reflection', 'bunch', 'image', 'plant', 'top', 'shadow', 'bowl', 'surface', 'other', 'large', 'name', 'view', 'field', 'next', 'small', 'full', 'back', 'picture', 'bean', 'photo', 'couple', 'napkin', 'line', 'dark', 'item', 'wall', 'end', 'design', 'fish']
2022-03-17 08:00:21,932.932 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'light', 'green', 'hole', 'stem', 'vegetable']
2022-03-17 08:02:45,766.766 2829:trainer.py:487 do_train_dict(): eta: 5:00:51 iter: 56100 speed: 271.9 images/sec total_norm: 151.2056 (154.6599) loss: 136.3632 (137.6634) masked_loss: 1.4114 (1.4704) tag_loss: 135.1435 (136.1930) time: 1.4322 (1.8830) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8778) save_time: 8.8421 (14.7395) lr: 0.000016 max mem: 26307
2022-03-17 08:02:46,128.128 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5714285969734192
2022-03-17 08:02:46,128.128 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.21681213378906
2022-03-17 08:02:46,129.129 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59144246790332
2022-03-17 08:03:14,426.426 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023218417540192604
2022-03-17 08:03:14,426.426 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:03:14,427.427 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'couple', 'of', 'zebra', 'standing', 'next', 'to', 'each', 'other', 'on', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:03:14,442.442 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['zebra', 'tree', 'sky', 'grass', 'leg', 'field', 'cloud', 'ear', 'tail', 'ground', 'mane', 'head', 'bush', 'dirt', 'fence', 'stripe', 'rock', 'animal', 'group', 'background', 'puddle', 'cow', '[UNK]', 'grassy', 'road', 'couple', 'herd', 'wood', 'other', 'patch', 'next', 'pole', 'area', 'stick', 'grazing', 'open', 'mud', 'green', 'wall', 'nose', 'path', 'sign', 'bird', 'person', 'car', 'line', 'building', 'water', 'horn', 'side']
2022-03-17 08:03:30,372.372 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['other', 'next', 'road', 'car', 'field', 'ground', 'date', 'couple', 'tree', 'wood', 'sky', 'leg', 'ear', 'grass', 'tail', 'bush', 'cloud', 'suv', 'mane', 'zebra']
2022-03-17 08:05:53,855.855 2829:trainer.py:487 do_train_dict(): eta: 4:57:58 iter: 56200 speed: 272.2 images/sec total_norm: 149.7986 (152.3104) loss: 137.5797 (138.6150) masked_loss: 1.3879 (1.4413) tag_loss: 135.9902 (137.1737) time: 1.4339 (1.8808) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4285 (1.8756) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:05:54,215.215 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274
2022-03-17 08:05:54,215.215 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.3431396484375
2022-03-17 08:05:54,215.215 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59326526242081
2022-03-17 08:06:22,283.283 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023203222081065178
2022-03-17 08:06:22,283.283 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:06:22,283.283 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'table', 'has', '[MASK]', '[MASK]', "'", 's', 'items', 'on', 'it', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:06:22,299.299 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'logo', 'wall', 'toy', 'hat', 'picture', 'box', 'man', 'book', 'table', 'window', 'car', 'sign', 'shirt', 'doll', 'helmet', 'tire', 'bear', 'floor', 'leg', 'light', 'person', 'poster', 'hair', 'bag', 'shelf', 'head', 'truck', 'door', 'figure', 'clothes', 'jacket', 'woman', 'wheel', 'ground', 'chair', 'suit', 'tree', 'store', 'hand', 'display', 'boot', 'uniform', 'desk', 'coat', 'pole', 'shoe', 'building', 'board', 'arm']
2022-03-17 08:06:38,206.206 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'book', 'board', 'table', 'wall', 'magazine', 'figure', 'box', 'block', 'picture', 'boat', 'bottle', 'hat', 'logo', 'toy', 'candy', 'miscellaneous']
2022-03-17 08:09:02,346.346 2829:trainer.py:487 do_train_dict(): eta: 4:55:05 iter: 56300 speed: 271.6 images/sec total_norm: 149.8941 (152.4911) loss: 139.7734 (140.5896) masked_loss: 1.4047 (1.4255) tag_loss: 137.9306 (139.1640) time: 1.4341 (1.8849) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4290 (1.8798) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:09:02,707.707 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5
2022-03-17 08:09:02,707.707 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.23934936523438
2022-03-17 08:09:02,707.707 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.5877472559611
2022-03-17 08:09:31,205.205 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023203494027256966
2022-03-17 08:09:31,206.206 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:09:31,206.206 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'piece', 'of', 'bro', '##cco', '##li', 'on', '[MASK]', 'metal', 'fork', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:09:31,222.222 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fork', '[UNK]', 'stem', 'table', 'background', 'plate', 'wall', 'close', 'food', 'apple', 'bowl', 'piece', 'spot', 'head', 'green', 'leaf', 'skin', 'object', 'handle', 'ring', 'banana', 'onion', 'fruit', 'surface', 'small', 'hole', 'water', 'rim', 'orange', 'vegetable', 'flower', 'white', 'yellow', 'ground', 'shadow', 'end', 'peel', 'line', 'other', 'top', 'band', 'next', 'side', 'reflection', 'image', 'plant', 'light', 'picture', 'body', 'full']
2022-03-17 08:09:47,120.120 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'table', 'food', 'metal', 'piece', 'plate', 'flower', 'stem', 'fork', 'banana']
2022-03-17 08:12:10,910.910 2829:trainer.py:487 do_train_dict(): eta: 4:52:13 iter: 56400 speed: 271.5 images/sec total_norm: 149.1908 (151.0065) loss: 141.2649 (141.3576) masked_loss: 1.4030 (1.4457) tag_loss: 139.2193 (139.9120) time: 1.4326 (1.8857) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4275 (1.8806) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:12:11,271.271 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543
2022-03-17 08:12:11,271.271 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.55862426757812
2022-03-17 08:12:11,272.272 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.58925298505125
2022-03-17 08:12:39,672.672 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023209817707538605
2022-03-17 08:12:39,672.672 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:12:39,672.672 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'red', 'tie', 'and', '[MASK]', 'is', 'holding', 'a', 'large', 'fish', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:12:39,688.688 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'tie', 'ear', 'nose', 'head', 'face', 'shirt', 'man', 'jacket', 'hair', 'mouth', 'neck', 'collar', 'tree', 'smile', 'chin', 'knot', 'hand', 'coat', 'teeth', 'arm', 'eyebrow', '[UNK]', 'background', 'finger', 'shoulder', 'sky', 'grass', 'suit', 'button', 'car', 'young', 'person', 'hat', 'boy', 'forehead', 'bush', 'hood', 'strap', 'lip', 'sleeve', 'field', 'camera', 'ground', 'ring', 'woman', 'wall', 'watch', 'fence', 'blue']
2022-03-17 08:12:55,702.702 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'water', 'large', 'red', 'mouth', 'smile', 'eye', 'neck', 'tree', 'sky', 'shirt', 'fish', 'animal', 'background', 'finger', 'nose', 'ear', 'suit', 'chin', 'tie', 'hat', 'cap', 'jacket', 'collar']
03-17 08:13:23.725 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 08:13:23.725 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 08:13:24.987 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 08:15:19,303.303 2829:trainer.py:487 do_train_dict(): eta: 4:49:20 iter: 56500 speed: 271.8 images/sec total_norm: 148.4305 (149.5362) loss: 138.8120 (138.3451) masked_loss: 1.4183 (1.4488) tag_loss: 137.2401 (136.8963) time: 1.4319 (1.8839) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4266 (1.8783) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:15:19,664.664 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272
2022-03-17 08:15:19,664.664 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.31787109375
2022-03-17 08:15:19,664.664 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.59516465284798
2022-03-17 08:15:48,079.079 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023205023258924484
2022-03-17 08:15:48,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:15:48,079.079 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'are', 'many', 'pedestrians', 'and', 'cyclists', 'along', 'this', 'small', 'street', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:15:48,095.095 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bike', '[UNK]', 'bicycle', 'shoe', 'man', 'line', 'jacket', 'street', 'road', 'person', 'glove', 'head', 'building', 'sidewalk', 'light', 'woman', 'hat', 'wheel', 'tire', 'sign', 'helmet', 'bag', 'hand', 'coat', 'shirt', 'car', 'leg', 'face', 'motorcycle', 'window', 'curb', 'jean', 'hair', 'pole', 'background', 'license', 'backpack', 'tree', 'umbrella', 'traffic', 'sky', 'boot', 'bus', 'van', 'vehicle', 'arm', 'glasses', 'basket', 'city', 'foot']
2022-03-17 08:16:04,034.034 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'head', 'man', 'hand', 'small', 'line', 'building', 'road', 'street', 'woman', 'hair', 'person', 'phone', 'cell', 'sign', 'leg', 'bag', 'wheel', 'coat', 'jacket', 'bike', 'bicycle', 'shoe', 'sidewalk', 'tire', 'curb', 'glove']
2022-03-17 08:18:27,874.874 2829:trainer.py:487 do_train_dict(): eta: 4:46:27 iter: 56600 speed: 271.5 images/sec total_norm: 147.9287 (151.0743) loss: 137.4375 (137.9743) masked_loss: 1.3778 (1.3839) tag_loss: 135.7999 (136.5904) time: 1.4342 (1.8858) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.8806) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:18:28,233.233 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 08:18:28,233.233 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.1803436279297
2022-03-17 08:18:28,233.233 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.60285644934922
2022-03-17 08:18:56,441.441 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023262659087777138
2022-03-17 08:18:56,441.441 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:18:56,441.441 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'person', '[MASK]', 'a', 'skate', '##board', 'on', 'a', 'street', 'doing', 'a', 'trick', '.', '[MASK]', '[PAD]', '[PAD]', ...]
2022-03-17 08:18:56,457.457 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', '[UNK]', 'bush', 'arm', 'ground', 'head', 'hand', 'boy', 'leg', 'shoe', 'man', 'wheel', 'tree', 'face', 'pad', 'knee', 'grass', 'wall', 'glove', 'sock', 'background', 'pole', 'field', 'short', 'young', 'fence', 'park', 'foot', 'person', 'belt', 'jean', 'ear', 'watch', 'nose', 'building', 'ball', 'bracelet', 'dirt', 'line', 'hat', 'hair', 'door', 'wrist', 'stripe', 'sleeve', 'elbow', 'uniform', 'band', 'road']
2022-03-17 08:19:12,369.369 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'door', 'road', 'street', 'light', 'car', 'ground', 'hair', 'person', 'arm', 'boy', 'plant', 'tree', 'watch', 'sky', 'shirt', 'nose', 'wheel', 'grass', 'bush', 'pole', 'wrist', 'trick', 'fence', 'helmet', 'shoe', 'sidewalk', 'curb', 'sweater', 'glove']
2022-03-17 08:21:36,293.293 2829:trainer.py:487 do_train_dict(): eta: 4:43:35 iter: 56700 speed: 271.7 images/sec total_norm: 149.2894 (150.7991) loss: 140.6784 (141.9022) masked_loss: 1.3397 (1.3747) tag_loss: 139.3652 (140.5275) time: 1.4313 (1.8841) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.8789) save_time: 8.8421 (14.7395) lr: 0.000015 max mem: 26307
2022-03-17 08:21:36,654.654 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 08:21:36,654.654 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 112.25345611572266
2022-03-17 08:21:36,654.654 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.60558924204867
2022-03-17 08:22:05,047.047 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023266250267624855
2022-03-17 08:22:05,048.048 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:22:05,048.048 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'detailed', 'kitchen', '[MASK]', 'shown', 'with', 'wood', 'floor', '##ing', '[MASK]', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:22:05,063.063 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cabinet', 'door', 'floor', 'stove', '[UNK]', 'wall', 'kitchen', 'oven', 'refrigerator', 'handle', 'top', 'drawer', 'outlet', 'knob', 'white', 'window', 'black', 'sink', 'ceiling', 'tile', 'light', 'empty', 'room', 'fan', 'large', 'shelf', 'switch', 'microwave', 'wood', 'rack', 'bag', 'clean', 'open', 'small', 'wooden', 'cord', 'logo', 'new', 'cupboard', 'paper', 'old', 'leg', 'brown', 'range', 'counter', 'next', 'area', 'silver', 'hood', 'modern']
2022-03-17 08:22:20,979.979 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'top', 'door', 'floor', 'wall', 'wood', 'kitchen', 'handle', 'cabinet', 'detailed', 'sink', 'drawer', 'outlet', 'stove', 'oven', 'refrigerator']
2022-03-17 08:24:44,865.865 2829:trainer.py:487 do_train_dict(): eta: 4:40:42 iter: 56800 speed: 271.5 images/sec total_norm: 148.0666 (149.1261) loss: 136.3703 (136.7830) masked_loss: 1.4991 (1.4966) tag_loss: 134.9193 (135.2864) time: 1.4320 (1.8858) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4269 (1.8806) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307
2022-03-17 08:24:45,225.225 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 08:24:45,225.225 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 107.49156188964844
2022-03-17 08:24:45,226.226 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.61769520335541
2022-03-17 08:25:13,688.688 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023308832198381424
2022-03-17 08:25:13,689.689 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 08:25:13,689.689 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', '[MASK]', 'in', 'a', 'kitchen', 'holding', 'a', '[MASK]', '##brush', '.', '[SEP]', '[PAD]', '[PAD]', ...]
2022-03-17 08:25:13,705.705 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'cabinet', 'kitchen', 'bowl', 'woman', 'curtain', 'hand', 'hair', 'glass', 'apron', 'shirt', 'bottle', 'table', 'window', 'wall', 'dress', 'container', 'shelf', 'pot', 'head', 'drawer', 'towel', 'cup', 'food', 'ear', 'girl', 'door', 'spoon', 'eye', 'refrigerator', 'face', 'sink', 'man', 'paper', 'nose', 'knife', 'person', 'plate', 'outlet', 'napkin', 'stove', 'handle', 'basket', 'top', 'counter', 'box', 'knob', 'fish', 'vase', 'arm']
2022-03-17 08:25:29,578.578 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'book', 'woman', 'hair', 'wall', 'glass', 'paper', 'window', 'shirt', 'kitchen', 'fish', 'dress', 'ear', 'bowl', 'cabinet', 'knife', 'bottle', 'sink', 'pot', 'towel', 'basket', 'curtain', 'stove', 'microwave', 'apron']
2022-03-17 08:27:53,510.510 2829:trainer.py:487 do_train_dict(): eta: 4:37:49 iter: 56900 speed: 271.4 images/sec total_norm: 149.3862 (152.0056) loss: 140.9772 (139.8971) masked_loss: 1.4529 (1.4219) tag_loss: 139.6408 (138.4752) time: 1.4308 (1.8865) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4257 (1.8813) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307
2022-03-17 08:27:53,871.871 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-17 08:27:53,872.872 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.73890686035156
2022-03-17 08:27:53,872.872 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.62367358960603 2022-03-17 08:28:22,705.705 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023288562893867493 2022-03-17 08:28:22,705.705 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:28:22,706.706 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'beige', 'and', 'red', 'and', 'a', 'blue', 'and', '[MASK]', 'and', 'white', 'train', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:28:22,721.721 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'sky', 'train', 'tree', 'track', 'pole', 'car', 'building', 'roof', 'tower', 'station', 'platform', 'door', 'sidewalk', 'ground', 'line', 'stripe', 'wire', 'red', 'bridge', 'sign', 'water', 'wheel', 'passenger', 'flag', 'fence', 'light', '[UNK]', 'wall', 'long', 'street', 'grass', 'next', 'person', 'white', 'logo', 'top', 'front', 'power', 'bench', 'gravel', 'chimney', 'bush', 'road', 'pavement', 'commuter', 'pillar', 'stop', 'railroad', 'other'] 2022-03-17 08:28:38,739.739 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['station', 'building', 'white', 'door', 'red', 'light', 'car', 'blue', 'track', 'green', 'window', 'train', 'tree', 'tower', 'sky', 'platform', 'roof', 'pole', 'bike', 'fence', 'stripe', 'beige'] 2022-03-17 08:31:02,075.075 2829:trainer.py:487 do_train_dict(): eta: 4:34:56 iter: 57000 speed: 271.5 images/sec total_norm: 146.9477 (150.6722) loss: 133.0166 (135.5403) masked_loss: 1.3881 (1.4403) tag_loss: 131.9956 (134.1000) time: 1.4313 (1.8856) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.8805) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:31:02,437.437 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 08:31:02,437.437 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 182.251953125 2022-03-17 08:31:02,438.438 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
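The eta: field shrinks by roughly one averaged iteration time per iteration (4:40:42 at iter 56800 down to 4:34:56 at iter 57000), consistent with eta ≈ (max_iter − iter) × avg_iter_time. A minimal sketch; max_iter is an assumption, since the log never states the total iteration count:

import datetime

def estimate_eta(cur_iter, max_iter, avg_iter_time_s):
    # Remaining iterations times average seconds per iteration, as H:MM:SS.
    remaining_s = (max_iter - cur_iter) * avg_iter_time_s
    return str(datetime.timedelta(seconds=int(remaining_s)))

# With the averages printed at iter 57000 and an assumed max_iter of 65800,
# this yields '4:36:33', in the neighborhood of the logged 4:34:56.
print(estimate_eta(57000, 65800, 1.8856))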
= 71.62265684725733 2022-03-17 08:31:31,150.150 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023288192227482796 2022-03-17 08:31:31,150.150 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:31:31,151.151 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'standing', 'on', 'a', '[MASK]', '[MASK]', 'holding', 'a', 'tennis', 'ra', '##c', '##quet', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:31:31,166.166 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'line', 'hand', '[UNK]', 'wall', 'shoe', 'man', 'tennis', 'court', 'arm', 'hair', 'leg', 'fence', 'jacket', 'building', 'shadow', 'head', 'ball', 'mouth', 'tree', 'handle', 'ground', 'face', 'pole', 'zipper', 'string', 'palm', 'wire', 'person', 'air', 'light', 'jean', 'stripe', 'logo', 'background', 'clothes', 'window', 'sign', 'shirt', 'house', 'writing', 'roof', 'street', 'mountain', 'guy', 'player', 'hill', 'cloud', 'net', 'antenna'] 2022-03-17 08:31:47,133.133 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'line', 'building', 'court', 'ground', 'hair', 'mouth', 'wall', 'arm', 'ball', 'sky', 'leg', 'handle', 'tennis', 'string', 'shadow', 'jacket', 'wire', 'fence', 'shoe'] 2022-03-17 08:34:10,873.873 2829:trainer.py:487 do_train_dict(): eta: 4:32:03 iter: 57100 speed: 271.2 images/sec total_norm: 148.8043 (150.7363) loss: 136.7576 (137.6475) masked_loss: 1.4386 (1.4457) tag_loss: 135.4401 (136.2019) time: 1.4320 (1.8880) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4267 (1.8828) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:34:11,235.235 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 08:34:11,235.235 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.95419311523438 2022-03-17 08:34:11,236.236 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.62324387210232 2022-03-17 08:34:40,029.029 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02328598126769066 2022-03-17 08:34:40,029.029 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:34:40,030.030 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'three', 'kids', 'sitting', 'on', 'a', 'couch', 'playing', 'a', '[MASK]', 'game', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:34:40,045.045 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'pillow', 'boy', 'head', 'shirt', 'hand', 'couch', 'eye', '[UNK]', 'face', 'remote', 'ear', 'cushion', 'wall', 'nose', 'jean', 'arm', 'table', 'smile', 'leg', 'controller', 'control', 'chair', 'mouth', 'sweater', 'book', 'young', 'blanket', 'window', 'woman', 'laptop', 'paper', 'room', 'phone', 'dog', 'red', 'game', 'foot', 'cord', 'floor', 'kid', 'video', 'man', 'sleeve', 'cat', 'sofa', 'curtain', 'logo', 'picture', 'sock'] 2022-03-17 08:34:55,914.914 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'game', 'woman', 'hair', 'girl', 'video', 'person', 'wall', 'boy', 'phone', 'jean', 'shirt', 'ear', 'hole', 'couch', 'pole', 'remote', 'glasses', 'logo', 'blanket', 'pillow', 'rug'] 2022-03-17 08:37:19,555.555 2829:trainer.py:487 do_train_dict(): eta: 4:29:10 iter: 57200 speed: 271.4 images/sec total_norm: 148.8788 (151.3214) loss: 137.1562 (137.7977) masked_loss: 1.4398 (1.4618) tag_loss: 135.5013 (136.3359) time: 1.4313 (1.8868) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4262 (1.8817) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:37:19,915.915 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 08:37:19,915.915 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.8505859375 2022-03-17 08:37:19,915.915 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6289490105594 2022-03-17 08:37:48,415.415 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023275790736079216 2022-03-17 08:37:48,415.415 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:37:48,416.416 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'box', 'of', 'six', 'don', '[MASK]', 'with', '[MASK]', '##eb', '##ora', '##te', 'decorations', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:37:48,431.431 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'box', 'hole', 'line', 'chocolate', 'table', 'different', 'paper', 'dozen', 'lid', 'food', 'pastry', 'top', 'light', 'cheese', 'cream', 'ball', 'wall', 'cake', 'candy', 'bunch', 'container', 'tile', 'yellow', 'open', 'sugar', 'variety', 'potato', 'full', 'group', 'reflection', 'dessert', 'cardboard', 'design', 'piece', 'bowl', 'stripe', 'orange', 'white', 'other', 'various', 'tray', 'large', 'several', 'kind', 'half', 'twelve', 'butter', 'glazed', 'close'] 2022-03-17 08:38:04,348.348 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'paper', 'box', 'hole'] 2022-03-17 08:40:28,435.435 2829:trainer.py:487 do_train_dict(): eta: 4:26:17 iter: 57300 speed: 271.1 images/sec total_norm: 148.6674 (151.1573) loss: 140.6454 (140.7898) masked_loss: 1.4309 (1.4318) tag_loss: 139.1984 (139.3580) time: 1.4306 (1.8888) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4256 (1.8838) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:40:28,796.796 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 08:40:28,797.797 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.8095703125 2022-03-17 08:40:28,797.797 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6352293100922 2022-03-17 08:40:57,641.641 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023335572332143784 2022-03-17 08:40:57,641.641 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:40:57,641.641 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'polar', 'bear', 'under', 'water', '[MASK]', 'with', 'a', '[MASK]', 'ball', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:40:57,657.657 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'polar', 'head', 'paw', 'nose', 'eye', 'water', 'ball', 'ear', 'leg', 'rock', 'claw', 'fur', 'egg', 'foot', 'mouth', 'white', 'face', 'ice', 'reflection', '[UNK]', 'snout', 'large', 'leaf', 'bubble', 'snow', 'ledge', 'animal', 'wall', 'ground', 'back', 'tail', 'fish', 'blue', 'food', 'small', 'light', 'bowl', 'underwater', 'object', 'hole', 'grass', 'branch', 'pool', 'container', 'plant', 'other', 'handle', 'close', 'brown'] 2022-03-17 08:41:13,570.570 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'water', 'white', 'playing', 'mouth', 'eye', 'ice', 'foot', 'ball', 'leg', 'nose', 'ear', 'bear', 'polar', 'paw'] 03-17 08:43:25.088 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 08:43:25.088 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 08:43:26.191 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 08:43:37,096.096 2829:trainer.py:487 do_train_dict(): eta: 4:23:23 iter: 57400 speed: 271.4 images/sec total_norm: 150.4902 (155.1706) loss: 139.3521 (139.5555) masked_loss: 1.3812 (1.4023) tag_loss: 138.2511 (138.1532) time: 1.4316 (1.8866) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.8814) save_time: 8.8421 (14.7395) lr: 0.000014 max mem: 26307 2022-03-17 08:43:37,458.458 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-17 08:43:37,459.459 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.96945190429688 2022-03-17 08:43:37,459.459 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
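The periodic monitor() lines report one dict per GPU with mem_used, mem_total and gpu_util, immediately after a "cmd_run(): nvidia-smi" entry. A plausible way to produce such a list is nvidia-smi's CSV query mode; this is a sketch of the idea, not the actual aml_server.py implementation:

import subprocess

def query_gpus():
    # One CSV line per GPU, e.g. "29000, 32510, 100".
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    gpus = []
    for line in out.strip().splitlines():
        used, total, util = (int(x) for x in line.split(", "))
        gpus.append({"mem_used": used, "mem_total": total, "gpu_util": util})
    return gpus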
= 71.64571930595066 2022-03-17 08:44:06,099.099 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023359859362244606 2022-03-17 08:44:06,099.099 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:44:06,100.100 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'living', 'room', 'with', 'a', '[MASK]', 'sectional', 'sofa', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:44:06,115.115 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'wall', 'room', 'couch', 'stair', 'ceiling', 'railing', 'television', 'ottoman', 'staircase', 'door', 'pillow', 'bike', 'bicycle', 'blanket', 'living', 'light', 'table', 'book', 'sofa', 'flower', 'tire', 'wheel', 'stand', 'chair', 'basket', 'bag', '[UNK]', 'step', 'decoration', 'stairway', 'refrigerator', 'magazine', 'lamp', 'cabinet', 'arm', 'cushion', 'map', 'microwave', 'doorway', 'rail', 'fan', 'apartment', 'house', 'kitchen', 'box', 'top', 'vase', 'rug', 'white'] 2022-03-17 08:44:22,087.087 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['long', 'room', 'book', 'door', 'light', 'living', 'television', 'floor', 'table', 'wall', 'magazine', 'stand', 'window', 'step', 'box', 'coffee', 'ceiling', 'ottoman', 'couch', 'flower', 'bike', 'blanket', 'pillow', 'bicycle', 'lamp', 'sofa', 'staircase', 'curtain', 'railing', 'vase', 'stairway', 'stair', 'bouquet', 'sectional'] 2022-03-17 08:46:46,064.064 2829:trainer.py:487 do_train_dict(): eta: 4:20:30 iter: 57500 speed: 270.9 images/sec total_norm: 147.8803 (153.1434) loss: 137.8856 (139.5840) masked_loss: 1.4182 (1.4364) tag_loss: 136.5273 (138.1477) time: 1.4342 (1.8896) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.8844) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:46:46,424.424 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 08:46:46,425.425 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 166.89979553222656 2022-03-17 08:46:46,425.425 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.64459221230612 2022-03-17 08:47:14,986.986 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023396525532007217 2022-03-17 08:47:14,986.986 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:47:14,987.987 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', '[MASK]', 'board', 'at', 'a', 'skate', 'park', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:47:15,002.002 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'wall', 'ramp', 'shirt', 'man', 'wheel', 'hand', 'head', 'arm', 'shoe', 'leg', 'shadow', 'floor', 'boy', 'short', 'light', 'skate', 'board', 'tile', 'hat', 'ceiling', 'hair', 'person', 'helmet', 'jean', 'building', 'graffiti', 'park', 'wire', 'pad', 'logo', 'ground', 'pole', 'air', 'sign', 'sky', 'foot', 'trick', 'knee', 'bowl', 'tree', 'door', 'picture', 'face', 'cap', 'fence', 'wood', 'line', 'box', 'sock'] 2022-03-17 08:47:30,945.945 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'number', 'face', 'park', 'short', 'board', 'hair', 'person', 'floor', 'wall', 'arm', 'guitar', 'phone', 'shirt', 'leg', 'drawing', 'fence', 'shoe', 'ramp', 'tile', 'skate', 'graffiti'] 2022-03-17 08:49:54,751.751 2829:trainer.py:487 do_train_dict(): eta: 4:17:37 iter: 57600 speed: 271.4 images/sec total_norm: 147.6853 (149.3448) loss: 136.7640 (140.0522) masked_loss: 1.3705 (1.4063) tag_loss: 135.5707 (138.6459) time: 1.4321 (1.8869) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.8814) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:49:55,111.111 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6571428775787354 2022-03-17 08:49:55,111.111 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.82594299316406 2022-03-17 08:49:55,112.112 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.65356919339872 2022-03-17 08:50:24,002.002 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023407846689224243 2022-03-17 08:50:24,002.002 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:50:24,002.002 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', '[MASK]', 'kind', 'of', 'surf', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:50:24,018.018 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'wall', 'ground', 'board', '[UNK]', 'pole', 'rope', 'post', 'base', 'stand', 'floor', 'shelf', 'rack', 'cord', 'grass', 'hook', 'display', 'box', 'chain', 'tag', 'door', 'leg', 'next', 'window', 'platform', 'wire', 'sky', 'tent', 'other', 'line', 'writing', 'ladder', 'orange', 'tree', 'banner', 'basket', 'art', 'table', 'sale', 'row', 'leaf', 'fence', 'object', 'group', 'shoe', 'white', 'net', 'beam', 'bunch', 'dirt'] 2022-03-17 08:50:39,919.919 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'different', 'ground', 'board', 'post', 'kind', 'wall', 'base', 'letter', 'sign', 'pole', 'beam', 'rope', 'bunch', 'banner', 'shelf', 'cord', 'bucket', 'surf', 'rack'] 2022-03-17 08:53:03,706.706 2829:trainer.py:487 do_train_dict(): eta: 4:14:44 iter: 57700 speed: 271.0 images/sec total_norm: 148.0219 (151.7331) loss: 138.9523 (140.3066) masked_loss: 1.4463 (1.4518) tag_loss: 137.7264 (138.8549) time: 1.4325 (1.8895) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8844) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:53:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6486486196517944 2022-03-17 08:53:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 150.50381469726562 2022-03-17 08:53:04,067.067 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6492380254409 2022-03-17 08:53:32,915.915 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0234339889138937 2022-03-17 08:53:32,916.916 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:53:32,916.916 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'long', 'dining', 'room', 'table', 'filled', 'with', 'people', 'in', 'dress', 'clothing', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:53:32,933.933 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['glass', 'table', 'man', 'shirt', 'person', 'hair', 'paper', 'ceiling', 'glasses', 'wall', 'light', 'head', 'bar', 'wine', '[UNK]', 'menu', 'restaurant', 'plate', 'woman', 'room', 'bottle', 'picture', 'hand', 'chair', 'napkin', 'window', 'group', 'door', 'lamp', 'cup', 'hat', 'mirror', 'jacket', 'sign', 'face', 'camera', 'long', 'ear', 'large', 'column', 'doorway', 'speaker', 'book', 'frame', 'water', 'phone', 'pitcher', 'beam', 'vent', 'shelf'] 2022-03-17 08:53:48,781.781 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'room', 'door', 'light', 'woman', 'hair', 'person', 'table', 'wall', 'glass', 'paper', 'bar', 'sign', 'shirt', 'picture', 'dress', 'bowl', 'handle', 'plate', 'bottle', 'ceiling', 'clothing', 'glasses', 'pitcher', 'lamp', 'menu', 'candle', 'lemon', 'napkin', 'receipt'] 2022-03-17 08:56:12,573.573 2829:trainer.py:487 do_train_dict(): eta: 4:11:50 iter: 57800 speed: 271.1 images/sec total_norm: 148.7063 (150.6022) loss: 135.1004 (137.7678) masked_loss: 1.3843 (1.4452) tag_loss: 133.6597 (136.3226) time: 1.4319 (1.8887) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4270 (1.8835) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:56:12,934.934 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-17 08:56:12,934.934 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.05091094970703 2022-03-17 08:56:12,934.934 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
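The caption acc values reduce to small integer fractions — 19/34 ≈ 0.5588 at iter 56800, 23/33 ≈ 0.6970 at iter 57800 — suggesting accuracy is computed only over the [MASK]ed positions visible in the "Input ids sample" lines, not over the whole padded sequence. A minimal sketch under that assumption (tensor names are illustrative):

import torch

def masked_token_accuracy(logits, target_ids, mask_positions):
    # logits: (B, L, V); target_ids: (B, L);
    # mask_positions: (B, L) bool, True where the input token was [MASK].
    pred = logits.argmax(dim=-1)
    correct = (pred == target_ids) & mask_positions
    return correct.sum().float() / mask_positions.sum().float()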
= 71.65965647195081 2022-03-17 08:56:41,607.607 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02344072423875332 2022-03-17 08:56:41,607.607 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:56:41,608.608 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'blue', '[MASK]', 'sitting', 'next', 'to', 'a', 'green', 'tree', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:56:41,623.623 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['vase', 'wall', 'floor', 'flower', 'plant', 'table', 'base', 'stand', 'leaf', 'shadow', '[UNK]', 'paper', 'book', 'blue', 'design', 'tree', 'pot', 'display', 'star', 'top', 'box', 'reflection', 'room', 'picture', 'next', 'shelf', 'frame', 'light', 'outlet', 'white', 'platform', 'large', 'window', 'leg', 'pedestal', 'front', 'wooden', 'tile', 'handle', 'small', 'colorful', 'painting', 'rug', 'man', 'head', 'corner', 'chair', 'green', 'hair', 'wood'] 2022-03-17 08:56:57,585.585 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['next', 'top', 'book', 'blue', 'green', 'floor', 'table', 'wall', 'stand', 'paper', 'plant', 'tree', 'box', 'wood', 'tall', 'flower', 'leaf', 'cloth', 'pot', 'vase'] 2022-03-17 08:59:21,647.647 2829:trainer.py:487 do_train_dict(): eta: 4:08:57 iter: 57900 speed: 270.8 images/sec total_norm: 149.0412 (150.6454) loss: 139.1870 (139.5264) masked_loss: 1.3890 (1.4332) tag_loss: 137.9378 (138.0931) time: 1.4333 (1.8907) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4281 (1.8855) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 08:59:22,008.008 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 08:59:22,009.009 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.33843231201172 2022-03-17 08:59:22,009.009 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.67407786270668 2022-03-17 08:59:50,967.967 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023425210267305374 2022-03-17 08:59:50,968.968 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 08:59:50,968.968 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'swinging', 'a', 'baseball', '[MASK]', '[MASK]', 'standing', 'on', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 08:59:50,983.983 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'helmet', 'grass', 'glove', 'jersey', 'tree', 'field', 'dirt', 'man', 'arm', 'baseball', '[UNK]', 'bat', 'sky', 'name', 'building', 'player', 'number', 'ball', 'uniform', 'head', 'fence', 'background', 'belt', 'catcher', 'hand', 'cloud', 'shoe', 'back', 'pole', 'ground', 'leg', 'person', 'hat', 'cap', 'roof', 'batter', 'game', 'home', 'handle', 'umpire', 'boy', 'pitcher', 'logo', 'stripe', 'red', 'short', 'base', 'band', 'house'] 2022-03-17 09:00:06,984.984 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'name', 'number', 'building', 'player', 'field', 'ground', 'person', 'arm', 'tree', 'baseball', 'ball', 'sky', 'shirt', 'jersey', 'grass', 'hat', 'cap', 'uniform', 'pole', 'dirt', 'bat', 'fence', 'helmet', 'glove'] 2022-03-17 09:02:30,685.685 2829:trainer.py:487 do_train_dict(): eta: 4:06:03 iter: 58000 speed: 270.8 images/sec total_norm: 148.7249 (150.2698) loss: 140.1587 (140.1680) masked_loss: 1.4372 (1.4344) tag_loss: 138.7953 (138.7336) time: 1.4332 (1.8904) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4279 (1.8852) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 09:02:31,047.047 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7428571581840515 2022-03-17 09:02:31,047.047 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.852783203125 2022-03-17 09:02:31,047.047 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.68301808526307 2022-03-17 09:03:00,365.365 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023407941684126854 2022-03-17 09:03:00,365.365 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:03:00,366.366 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'gi', '##rra', '##ffe', 'standing', '[MASK]', 'to', 'some', 'rocks', 'and', 'trees', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:03:00,381.381 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'ear', '[UNK]', 'rock', 'horn', 'eye', 'neck', 'spot', 'mane', 'zoo', 'tree', 'mouth', 'bush', 'nose', 'shadow', 'wall', 'branch', 'face', 'ground', 'plant', 'boulder', 'next', 'leg', 'grass', 'hair', 'trunk', 'tongue', 'leaf', 'stone', 'pole', 'standing', 'chin', 'enclosure', 'arm', 'weed', 'young', 'body', 'fence', 'vine', 'dirt', 'hat', 'front', 'other', 'small', 'top', 'ivy', 'animal', 'post', 'large', 'tall'] 2022-03-17 09:03:16,253.253 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'next', 'rock', 'mouth', 'eye', 'plant', 'neck', 'tree', 'spot', 'nose', 'ear', 'shadow', 'bush', 'flower', 'leaf', 'horn', 'ivy', 'zoo', 'mane'] 2022-03-17 09:05:39,692.692 2829:trainer.py:487 do_train_dict(): eta: 4:03:10 iter: 58100 speed: 270.9 images/sec total_norm: 149.6495 (152.0553) loss: 140.6216 (139.8002) masked_loss: 1.3836 (1.4795) tag_loss: 138.8303 (138.3206) time: 1.4317 (1.8900) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.8848) save_time: 8.8421 (14.7395) lr: 0.000013 max mem: 26307 2022-03-17 09:05:40,053.053 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6875 2022-03-17 09:05:40,053.053 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.70963287353516 2022-03-17 09:05:40,053.053 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
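Tag Precision (a running value creeping from ~71.60 to ~71.75 over this span) and Tag mAP (~0.0233) point at a multi-label tag evaluation: the "Sample Generation" list is the model's top-scoring tags, the "GT Tags" list is the reference set, and precision measures their overlap, while mAP averaged over a large tag vocabulary stays tiny because most classes never occur in the ground truth. A hedged sketch of the set-overlap precision; how many predictions are kept (top-k or a score threshold) is an assumption:

def tag_precision(predicted_tags, gt_tags):
    # Percentage of predicted tags that appear in the ground-truth set.
    predicted, gt = set(predicted_tags), set(gt_tags)
    if not predicted:
        return 0.0
    return 100.0 * len(predicted & gt) / len(predicted)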
= 71.68773946729313 2022-03-17 09:06:10,752.752 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023405494168400764 2022-03-17 09:06:10,753.753 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:06:10,753.753 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'at', 'an', '[MASK]', 'market', 'standing', 'behind', 'boxes', '[MASK]', 'bananas', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:06:10,769.769 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'box', 'umbrella', 'man', 'building', 'hat', 'banana', 'shirt', 'sky', 'woman', 'table', '[UNK]', 'city', 'pole', 'crate', 'head', 'boot', 'ground', 'shoe', 'bag', 'jacket', 'floor', 'cap', 'stand', 'hair', 'jean', 'cart', 'light', 'hand', 'group', 'market', 'backpack', 'tree', 'wheel', 'background', 'sign', 'leg', 'bunch', 'street', 'short', 'yellow', 'glasses', 'chair', 'large', 'bottle', 'boy', 'sweater', 'sunglasses', 'sidewalk', 'road'] 2022-03-17 09:06:26,657.657 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'head', 'man', 'building', 'woman', 'ground', 'hair', 'person', 'table', 'market', 'box', 'sky', 'jean', 'shirt', 'leg', 'bag', 'flag', 'hat', 'pole', 'outdoor', 'banner', 'boot', 'shoe', 'umbrella', 'backpack', 'banana', 'crate'] 2022-03-17 09:08:50,328.328 2829:trainer.py:487 do_train_dict(): eta: 4:00:17 iter: 58200 speed: 268.6 images/sec total_norm: 148.6850 (150.6591) loss: 136.6240 (137.9547) masked_loss: 1.4319 (1.4384) tag_loss: 135.3579 (136.5163) time: 1.4333 (1.9064) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.9012) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:08:50,687.687 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127 2022-03-17 09:08:50,687.687 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.49464416503906 2022-03-17 09:08:50,687.687 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.6921262463002 2022-03-17 09:09:20,177.177 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023391637951135635 2022-03-17 09:09:20,178.178 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:09:20,178.178 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', '[MASK]', 'goats', 'sitting', 'on', '[MASK]', 'green', 'grass', 'beside', 'a', 'body', 'of', 'water', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:09:20,193.193 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'grass', 'gravel', 'rock', 'plant', 'ground', 'trunk', 'wood', 'field', 'forest', 'hill', 'animal', 'sheep', '[UNK]', 'pond', 'branch', 'bush', 'head', 'shadow', 'dirt', 'goat', 'cow', 'fence', 'grassy', 'leg', 'flower', 'hole', 'horse', 'water', 'donkey', 'stick', 'area', 'post', 'green', 'lush', 'leaf', 'lamb', 'grazing', 'group', 'roof', 'wall', 'herd', 'open', 'ear', 'white', 'pine', 'large', 'log', 'elephant', 'road'] 2022-03-17 09:09:36,061.061 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'group', 'water', 'body', 'ground', 'rock', 'green', 'hill', 'forest', 'plant', 'tree', 'wood', 'branch', 'shadow', 'grass', 'bush', 'dirt', 'trunk', 'sheep', 'gravel', 'cow', 'goat', 'lush'] 2022-03-17 09:11:59,523.523 2829:trainer.py:487 do_train_dict(): eta: 3:57:23 iter: 58300 speed: 270.6 images/sec total_norm: 149.2654 (152.6587) loss: 137.8673 (138.9792) masked_loss: 1.4236 (1.3977) tag_loss: 136.4411 (137.5815) time: 1.4326 (1.8920) data: 0.0002 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4276 (1.8870) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:11:59,883.883 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 09:11:59,884.884 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.43861389160156 2022-03-17 09:11:59,884.884 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.69305378770176 2022-03-17 09:12:29,420.420 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023406347259879112 2022-03-17 09:12:29,420.420 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:12:29,421.421 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'very', 'small', 'and', '[MASK]', 'kitchen', 'sits', 'upwards', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:12:29,436.436 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['[UNK]', 'kitchen', 'cabinet', 'window', 'sink', 'handle', 'stove', 'towel', 'curtain', 'wall', 'microwave', 'door', 'refrigerator', 'ceiling', 'oven', 'bottle', 'light', 'drawer', 'floor', 'pot', 'container', 'cup', 'rack', 'bowl', 'paper', 'magnet', 'bag', 'basket', 'sponge', 'top', 'knob', 'knife', 'picture', 'shelf', 'clock', 'maker', 'plant', 'outlet', 'dish', 'vase', 'jar', 'cord', 'glove', 'mug', 'glass', 'kettle', 'plate', 'spoon', 'counter', 'green'] 2022-03-17 09:12:45,429.429 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'small', 'top', 'door', 'light', 'cup', 'wall', 'paper', 'window', 'kitchen', 'clean', 'handle', 'cabinet', 'bottle', 'ceiling', 'sink', 'pot', 'towel', 'curtain', 'shelf', 'container', 'drawer', 'mug', 'spoon', 'glove', 'stove', 'knob', 'oven', 'refrigerator', 'microwave'] 03-17 09:13:26.292 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 09:13:26.292 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 09:13:27.560 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 09:15:09,114.114 2829:trainer.py:487 do_train_dict(): eta: 3:54:29 iter: 58400 speed: 270.1 images/sec total_norm: 148.1431 (150.0670) loss: 137.8973 (138.9593) masked_loss: 1.3678 (1.4088) tag_loss: 136.2453 (137.5505) time: 1.4342 (1.8959) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4291 (1.8908) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:15:09,475.475 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 09:15:09,475.475 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 107.7069091796875 2022-03-17 09:15:09,475.475 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.69959824390901 2022-03-17 09:15:38,681.681 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02340935356914997 2022-03-17 09:15:38,681.681 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:15:38,682.682 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'older', 'woman', 'walks', 'in', 'the', 'rain', 'with', '[MASK]', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:15:38,697.697 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['umbrella', 'hand', 'sidewalk', 'jacket', 'woman', 'person', 'building', 'ground', 'rain', 'line', 'coat', 'shoe', 'bag', 'mouth', '[UNK]', 'purse', 'window', 'street', 'face', 'man', 'head', 'skirt', 'sign', 'leg', 'hair', 'dress', 'handle', 'pole', 'hat', 'reflection', 'foot', 'road', 'watch', 'light', 'jean', 'car', 'glasses', 'lady', 'curb', 'fence', 'tree', 'rainy', 'strap', 'door', 'tire', 'city', 'shirt', 'boot', 'wall', 'store'] 2022-03-17 09:15:54,589.589 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'line', 'building', 'door', 'road', 'street', 'woman', 'car', 'ground', 'hair', 'mouth', 'person', 'foot', 'tree', 'watch', 'bus', 'leg', 'dress', 'bag', 'rain', 'truck', 'coat', 'jacket', 'fence', 'purse', 'skirt', 'shoe', 'sidewalk', 'umbrella'] 2022-03-17 09:18:18,547.547 2829:trainer.py:487 do_train_dict(): eta: 3:51:36 iter: 58500 speed: 270.3 images/sec total_norm: 149.4313 (151.9876) loss: 141.6673 (139.2598) masked_loss: 1.3650 (1.3889) tag_loss: 140.1983 (137.8709) time: 1.4327 (1.8943) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.8891) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:18:18,910.910 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8529411554336548 2022-03-17 09:18:18,910.910 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 125.83739471435547 2022-03-17 09:18:18,910.910 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.71227444723604 2022-03-17 09:18:48,264.264 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023428311571478844 2022-03-17 09:18:48,264.264 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:18:48,265.265 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'that', 'is', 'laying', 'between', '[MASK]', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:18:48,280.280 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'jean', 'dog', 'hair', 'head', 'couch', 'boy', 'hand', 'sock', 'leg', 'remote', 'ear', 'person', 'floor', 'eye', 'blanket', 'pillow', 'wall', 'shoe', 'collar', '[UNK]', 'nose', 'face', 'carpet', 'mouth', 'child', 'paw', 'tail', 'control', 'bed', 'foot', 'woman', 'cushion', 'rug', 'chair', 'girl', 'glasses', 'arm', 'sweater', 'man', 'controller', 'sofa', 'kid', 'book', 'young', 'knee', 'door', 'table', 'baby', 'laptop'] 2022-03-17 09:19:04,246.246 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'hand', 'face', 'book', 'hair', 'girl', 'mouth', 'child', 'bed', 'table', 'wall', 'boy', 'magazine', 'eye', 'jean', 'shirt', 'dog', 'leg', 'ear', 'kid', 'tail', 'couch', 'mouse', 'keyboard', 'collar', 'pillow', 'shoe', 'cushion', 'sock'] 2022-03-17 09:21:27,904.904 2829:trainer.py:487 do_train_dict(): eta: 3:48:42 iter: 58600 speed: 270.4 images/sec total_norm: 150.2085 (153.8813) loss: 138.6582 (138.3724) masked_loss: 1.4537 (1.4222) tag_loss: 137.0674 (136.9502) time: 1.4316 (1.8936) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4264 (1.8883) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:21:28,265.265 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-17 09:21:28,266.266 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.38931274414062 2022-03-17 09:21:28,266.266 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.72021060014299 2022-03-17 09:21:57,634.634 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02346744015812874 2022-03-17 09:21:57,634.634 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:21:57,634.634 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'men', 'in', 'a', 'batting', 'position', 'playing', 'baseball', 'in', '[MASK]', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:21:57,650.650 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['dirt', 'field', 'helmet', 'shoe', 'grass', '[UNK]', 'line', 'man', 'bat', 'catcher', 'uniform', 'shirt', 'glove', 'leg', 'plate', 'mask', 'batter', 'home', 'umpire', 'player', 'belt', 'baseball', 'jersey', 'head', 'ground', 'hand', 'number', 'fence', 'game', 'base', 'guard', 'wall', 'box', 'shin', 'arm', 'person', 'stand', 'hat', 'ready', 'camera', 'ball', 'cooler', 'pitch', 'face', 'name', 'swing', 'sock', 'stripe', 'chair', 'spectator'] 2022-03-17 09:22:13,615.615 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'name', 'home', 'line', 'player', 'field', 'position', 'ground', 'baseball', 'shirt', 'jersey', 'leg', 'plate', 'grass', 'belt', 'uniform', 'dirt', 'bat', 'mask', 'batting', 'helmet', 'shoe', 'catcher', 'glove', 'batter'] 2022-03-17 09:24:37,370.370 2829:trainer.py:487 do_train_dict(): eta: 3:45:48 iter: 58700 speed: 270.2 images/sec total_norm: 148.3143 (152.4183) loss: 136.0296 (137.7186) masked_loss: 1.4282 (1.4334) tag_loss: 134.8833 (136.2852) time: 1.4313 (1.8946) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4260 (1.8890) save_time: 8.8421 (14.7395) lr: 0.000012 max mem: 26307 2022-03-17 09:24:37,730.730 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44117647409439087 2022-03-17 09:24:37,730.730 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 118.98245239257812 2022-03-17 09:24:37,730.730 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
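The speed field agrees with a fixed global batch divided by the averaged iteration time: 512 / 1.8946 s ≈ 270.2 images/sec for the iter-58700 record above, and the same 512 reproduces the other speed values in this span (e.g. 512 / 1.8858 ≈ 271.5). A global batch of 512 (64 per GPU across the 8 V100s being monitored) is inferred from the numbers, not stated in the log:

def images_per_sec(global_batch_size, avg_iter_time_s):
    # Throughput as logged by the trainer, under the assumed batch size.
    return global_batch_size / avg_iter_time_s

print(round(images_per_sec(512, 1.8946), 1))  # 270.2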
= 71.72977999602857 2022-03-17 09:25:06,934.934 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02347312681376934 2022-03-17 09:25:06,934.934 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:25:06,935.935 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'zebra', '##s', 'in', 'their', 'pen', 'some', '[MASK]', 'a', 'fence', 'and', 'tree', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:25:06,950.950 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'trunk', 'zebra', 'ground', 'shadow', 'pole', 'post', 'leaf', 'leg', 'fence', '[UNK]', 'grass', 'trough', 'dirt', 'head', 'rock', 'box', 'wood', 'enclosure', 'board', 'tail', 'branch', 'stripe', 'zoo', 'ear', 'building', 'mane', 'food', 'log', 'shade', 'wooden', 'bench', 'hay', 'roof', 'wall', 'cart', 'structure', 'next', 'other', 'bush', 'area', 'sign', 'feeder', 'door', 'basket', 'bin', 'crate', 'group', 'couple', 'horn'] 2022-03-17 09:25:22,941.941 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'building', 'ground', 'post', 'food', 'tree', 'leg', 'gate', 'shadow', 'grass', 'pole', 'dirt', 'leaf', 'pen', 'trunk', 'fence', 'shade', 'zoo', 'trough', 'zebra'] 2022-03-17 09:27:47,126.126 2829:trainer.py:487 do_train_dict(): eta: 3:42:54 iter: 58800 speed: 269.8 images/sec total_norm: 148.7619 (152.6464) loss: 137.1708 (137.0621) masked_loss: 1.4380 (1.4223) tag_loss: 135.5548 (135.6398) time: 1.4323 (1.8976) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.8925) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:27:47,488.488 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 09:27:47,488.488 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.76809692382812 2022-03-17 09:27:47,488.488 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.73064879205312 2022-03-17 09:28:16,718.718 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023475831374526024 2022-03-17 09:28:16,718.718 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:28:16,719.719 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bethany', 'clock', 'on', 'the', 'outside', 'of', 'a', '[MASK]', 'building', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:28:16,734.734 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'clock', 'wall', '[UNK]', 'sun', 'star', 'window', 'number', 'painting', 'statue', 'roof', 'wire', 'arch', 'archway', 'sky', 'circle', 'balcony', 'tower', 'hand', 'roman', 'pipe', 'large', 'sculpture', 'column', 'spire', 'lion', 'decoration', 'side', 'sign', 'light', 'design', 'pillar', 'art', 'wing', 'pole', 'cross', 'bird', 'brick', 'big', 'ceiling', 'city', 'railing', 'street', 'door', 'picture', 'reflection', 'tree', 'bridge', 'ornate', 'sword'] 2022-03-17 09:28:32,685.685 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['number', 'building', 'large', 'light', 'outside', 'star', 'wall', 'sun', 'window', 'sky', 'roof', 'circle', 'clock', 'concrete', 'statue', 'arch', 'balcony', 'archway'] 2022-03-17 09:30:56,671.671 2829:trainer.py:487 do_train_dict(): eta: 3:40:01 iter: 58900 speed: 270.1 images/sec total_norm: 149.1548 (153.4005) loss: 138.1637 (140.4498) masked_loss: 1.3448 (1.4004) tag_loss: 136.5350 (139.0494) time: 1.4325 (1.8954) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8902) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:30:57,032.032 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.44736841320991516 2022-03-17 09:30:57,032.032 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.7872314453125 2022-03-17 09:30:57,032.032 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.73392271268166 2022-03-17 09:31:26,801.801 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02353179268538952 2022-03-17 09:31:26,801.801 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:31:26,802.802 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tile', 'floor', 'in', 'an', '[MASK]', 'kitchen', 'with', '[MASK]', 'doors', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:31:26,817.817 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'door', 'plant', 'wall', 'window', '[UNK]', 'kitchen', 'table', 'tile', 'fence', 'cabinet', 'curtain', 'towel', 'tree', 'vase', 'leg', 'railing', 'ceiling', 'rack', 'chair', 'handle', 'bowl', 'switch', 'outlet', 'sink', 'rug', 'patio', 'dish', 'shelf', 'balcony', 'house', 'mat', 'pot', 'cloth', 'light', 'plate', 'room', 'refrigerator', 'board', 'oven', 'drawer', 'basket', 'top', 'shoe', 'counter', 'stove', 'cutting', 'picture', 'blind', 'flower'] 2022-03-17 09:31:42,835.835 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'open', 'door', 'floor', 'table', 'wall', 'plant', 'window', 'tree', 'kitchen', 'picture', 'leg', 'bowl', 'handle', 'cabinet', 'ceiling', 'flower', 'switch', 'sink', 'cloth', 'fence', 'towel', 'curtain', 'balcony', 'mat', 'tile', 'rack', 'railing', 'microwave', 'vase', 'patio', 'rug'] 2022-03-17 09:34:06,073.073 2829:trainer.py:487 do_train_dict(): eta: 3:37:07 iter: 59000 speed: 270.3 images/sec total_norm: 149.2645 (151.1273) loss: 137.3948 (139.3826) masked_loss: 1.4110 (1.4411) tag_loss: 135.7160 (137.9415) time: 1.4325 (1.8941) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4271 (1.8888) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:34:06,435.435 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7428571581840515 2022-03-17 09:34:06,435.435 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.85707092285156 2022-03-17 09:34:06,435.435 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
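The lr column steps down from 0.000014 to 0.000011 across these ~2,500 iterations. Since it is printed to only six decimal places, the underlying schedule is presumably smooth and the steps are rounding artifacts, consistent with a decay toward zero near the end of training. A sketch under a linear-decay assumption; base_lr and max_iter are assumptions:

def linear_decay_lr(base_lr, cur_iter, max_iter):
    # Linearly anneal the learning rate to zero at max_iter.
    return base_lr * (1.0 - cur_iter / max_iter)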
= 71.73886231682224 2022-03-17 09:34:36,152.152 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02352909743785858 2022-03-17 09:34:36,152.152 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:34:36,152.152 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'four', 'tennis', 'players', 'are', 'competing', 'on', 'a', '[MASK]', 'court', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:34:36,168.168 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'court', 'net', 'tennis', '[UNK]', 'short', 'man', 'line', 'hair', 'shoe', 'fence', 'person', 'leg', 'shadow', 'hand', 'player', 'head', 'pole', 'woman', 'arm', 'hat', 'sign', 'outfit', 'tree', 'wall', 'chair', 'girl', 'ball', 'boy', 'sky', 'sock', 'match', 'uniform', 'skirt', 'dress', 'couple', 'top', 'stand', 'cloud', 'light', 'bag', 'game', 'car', 'playing', 'ground', 'building', 'cap', 'background', 'group', 'house'] 2022-03-17 09:34:52,130.130 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'line', 'woman', 'court', 'short', 'hair', 'girl', 'arm', 'shirt', 'leg', 'dress', 'tennis', 'shadow', 'net', 'fence', 'shoe', 'outfit'] 2022-03-17 09:37:15,684.684 2829:trainer.py:487 do_train_dict(): eta: 3:34:13 iter: 59100 speed: 270.0 images/sec total_norm: 149.6712 (150.8104) loss: 141.0224 (139.9859) masked_loss: 1.3148 (1.4002) tag_loss: 139.3551 (138.5858) time: 1.4316 (1.8960) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.8909) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:37:16,049.049 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365 2022-03-17 09:37:16,050.050 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.25619506835938 2022-03-17 09:37:16,050.050 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.74285661207664 2022-03-17 09:37:45,809.809 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023554671555757523 2022-03-17 09:37:45,810.810 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:37:45,810.810 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'laptop', 'sits', 'on', 'the', 'edge', 'of', '##lio', 'counter', 'with', 'chairs', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:37:45,825.825 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['floor', 'laptop', 'table', 'couch', 'door', 'keyboard', 'room', 'stool', 'rug', 'screen', 'pillow', 'bag', 'tile', 'wall', 'computer', 'leg', 'book', 'mouse', 'cup', 'coffee', 'shelf', '[UNK]', 'magazine', 'sofa', 'cord', 'top', 'living', 'chair', 'stand', 'remote', 'can', 'mug', 'cushion', 'antenna', 'seat', 'bowl', 'blanket', 'handle', 'tray', 'monitor', 'television', 'plate', 'box', 'small', 'purse', 'ottoman', 'bottle', 'paper', 'picture', 'backpack'] 2022-03-17 09:38:01,735.735 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'room', 'top', 'book', 'door', 'cup', 'living', 'floor', 'table', 'wall', 'magazine', 'computer', 'edge', 'screen', 'coffee', 'leg', 'bag', 'counter', 'plate', 'bottle', 'couch', 'mouse', 'purse', 'keyboard', 'pillow', 'sofa', 'shelf', 'laptop', 'tile', 'stool', 'rug'] 2022-03-17 09:40:25,225.225 2829:trainer.py:487 do_train_dict(): eta: 3:31:19 iter: 59200 speed: 270.1 images/sec total_norm: 147.8494 (150.2399) loss: 139.1447 (140.8926) masked_loss: 1.4374 (1.4575) tag_loss: 137.5555 (139.4351) time: 1.4316 (1.8954) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4262 (1.8903) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307 2022-03-17 09:40:25,587.587 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-17 09:40:25,587.587 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.48828125 2022-03-17 09:40:25,587.587 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.75340359126537
2022-03-17 09:40:55,144.144 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02355773001909256
2022-03-17 09:40:55,146.146 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:40:55,146.146 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'hot', '##dog', 'is', 'being', 'eaten', 'by', 'a', 'man', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:40:55,161.161 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hat', 'cap', 'man', 'hair', 'nose', 'head', 'hand', 'dog', 'ear', 'hot', 'jacket', 'woman', 'eye', 'window', 'sign', 'person', 'building', '[UNK]', 'bun', 'foil', 'sweater', 'face', 'door', 'shirt', 'scarf', 'mouth', 'thumb', 'coat', 'light', 'logo', 'letter', 'reflection', 'wall', 'finger', 'sunglasses', 'paper', 'vest', 'backpack', 'girl', 'bag', 'jean', 'food', 'pole', 'glasses', 'strap', 'zipper', 'store', 'sleeve', 'number', 'collar']
2022-03-17 09:41:11,178.178 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'building', 'door', 'woman', 'hair', 'person', 'hot', 'eye', 'window', 'sign', 'shirt', 'dog', 'nose', 'ear', 'hat', 'cap', 'jacket', 'bow', 'sweater', 'foil', 'scarf', 'bun']
03-17 09:43:27.657 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 09:43:27.657 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 09:43:28.924 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 09:43:34,721.721 2829:trainer.py:487 do_train_dict(): eta: 3:28:25 iter: 59300 speed: 270.2 images/sec total_norm: 148.4815 (152.0117) loss: 141.4568 (140.8736) masked_loss: 1.4203 (1.4649) tag_loss: 139.8029 (139.4087) time: 1.4312 (1.8950) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4260 (1.8898) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307
2022-03-17 09:43:35,079.079 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-17 09:43:35,080.080 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 146.22079467773438
2022-03-17 09:43:35,080.080 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.7579834099972
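The monitor() record above reports one dict per GPU with mem_used, mem_total (MiB) and gpu_util (%). A sketch of producing the same shape via nvidia-smi's CSV query mode; the query flags are standard nvidia-smi options, but aml_server.py itself (which the log shows invoking plain `nvidia-smi`) may well parse the human-readable table instead:

```python
import subprocess

def gpu_stats():
    """Return one dict per GPU, shaped like the monitor() records above."""
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=memory.used,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        text=True)
    stats = []
    for line in out.strip().splitlines():
        # Each CSV line looks like "29000, 32510, 100" with nounits set.
        mem_used, mem_total, gpu_util = (int(v) for v in line.split(", "))
        stats.append({"mem_used": mem_used,
                      "mem_total": mem_total,
                      "gpu_util": gpu_util})
    return stats
```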
2022-03-17 09:44:05,006.006 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023516250774264336
2022-03-17 09:44:05,006.006 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:44:05,007.007 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'row', 'of', 'red', 'valves', '[MASK]', 'dotted', 'among', 'some', 'shrubs', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:44:05,022.022 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fire', '[UNK]', 'cap', 'bolt', 'wall', 'bush', 'plant', 'top', 'leaf', 'red', 'base', 'knob', 'flower', 'bottom', 'background', 'green', 'chain', 'line', 'foliage', 'sign', 'stem', 'window', 'wheel', 'next', 'ground', 'blue', 'tree', 'reflection', 'rock', 'plug', 'group', 'writing', 'block', 'building', 'dirt', 'toy', 'water', 'nut', 'pipe', 'front', 'side', 'garden', 'silver', 'tile', 'lid', 'picture', 'letter', 'handle', 'area', 'fence']
2022-03-17 09:44:20,969.969 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'top', 'red', 'fire', 'wall', 'base', 'plant', 'row', 'bush', 'cap', 'flower', 'bolt']
2022-03-17 09:46:44,329.329 2829:trainer.py:487 do_train_dict(): eta: 3:25:30 iter: 59400 speed: 270.0 images/sec total_norm: 148.0462 (153.2958) loss: 137.4192 (139.5342) masked_loss: 1.4298 (1.4557) tag_loss: 135.9200 (138.0785) time: 1.4325 (1.8961) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.8909) save_time: 8.8421 (14.7395) lr: 0.000011 max mem: 26307
2022-03-17 09:46:44,690.690 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.529411792755127
2022-03-17 09:46:44,690.690 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.42683410644531
2022-03-17 09:46:44,691.691 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.75964920300396 2022-03-17 09:47:14,616.616 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023525401949882507 2022-03-17 09:47:14,616.616 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:47:14,617.617 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'leaning', 'up', 'against', 'a', 'wall', 'with', 'graffiti', 'and', 'text', '##ing', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:47:14,632.632 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'wall', 'short', 'hair', 'boy', 'girl', 'graffiti', 'head', 'hand', '[UNK]', 'person', 'sidewalk', 'man', 'shoe', 'ground', 'leg', 'woman', 'phone', 'sock', 'young', 'building', 'arm', 'group', 'flop', 'foot', 'design', 'window', 'face', 'flip', 'ear', 'shadow', 'floor', 'cell', 'painting', 'hat', 'sky', 'eye', 'sleeve', 'glasses', 'kid', 'drawing', 'tree', 'child', 'picture', 'sweater', 'leaf', 'bat', 'sign', 'curb', 'bench'] 2022-03-17 09:47:30,605.605 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'group', 'door', 'woman', 'short', 'ground', 'hair', 'girl', 'person', 'wall', 'arm', 'boy', 'phone', 'eye', 'shirt', 'leg', 'boot', 'shoe', 'dot', 'sidewalk', 'graffiti'] 2022-03-17 09:49:54,150.150 2829:trainer.py:487 do_train_dict(): eta: 3:22:36 iter: 59500 speed: 269.7 images/sec total_norm: 150.6443 (153.0435) loss: 136.9487 (137.0340) masked_loss: 1.4257 (1.4436) tag_loss: 135.8034 (135.5904) time: 1.4324 (1.8982) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8930) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307 2022-03-17 09:49:54,511.511 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 09:49:54,512.512 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.71640014648438 2022-03-17 09:49:54,512.512 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.75877954495833
2022-03-17 09:50:24,304.304 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023510891944169998
2022-03-17 09:50:24,304.304 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:50:24,305.305 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'young', 'premiered', 'holding', 'a', 'tennis', 'ball', 'and', 'swinging', 'a', '[MASK]', 'ra', '##c', '##quet', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:50:24,320.320 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', 'tennis', '[UNK]', 'short', 'court', 'leg', 'sock', 'hand', 'line', 'shirt', 'head', 'shadow', 'hair', 'woman', 'ground', 'arm', 'handle', 'player', 'face', 'logo', 'ball', 'hat', 'ponytail', 'stripe', 'ear', 'cap', 'man', 'string', 'mouth', 'nose', 'band', 'blue', 'girl', 'female', 'skirt', 'knee', 'letter', 'sleeve', 'game', 'young', 'wall', 'top', 'glasses', 'eye', 'wrist', 'male', 'ready', 'outfit', 'necklace', 'white']
2022-03-17 09:50:40,283.283 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'line', 'young', 'player', 'woman', 'court', 'short', 'ground', 'hair', 'arm', 'ball', 'shirt', 'leg', 'handle', 'tennis', 'string', 'shadow', 'shoe', 'ponytail', 'sock']
2022-03-17 09:53:03,904.904 2829:trainer.py:487 do_train_dict(): eta: 3:19:42 iter: 59600 speed: 269.8 images/sec total_norm: 149.9381 (151.4218) loss: 136.2480 (138.5462) masked_loss: 1.4119 (1.4164) tag_loss: 135.0492 (137.1298) time: 1.4318 (1.8975) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4268 (1.8925) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307
2022-03-17 09:53:04,264.264 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5833333134651184
2022-03-17 09:53:04,264.264 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.49700927734375
2022-03-17 09:53:04,264.264 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.75044187229483
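Each evaluation record pairs the model's ranked "Sample Generation" tags with the annotated "GT Tags". A set-overlap precision over the top-k generated tags, illustrated on the 09:50:24 record above, gives a feel for the numbers; this is an assumption about the metric, since the exact definition behind the running "Tag Precision" figure is not visible in the log:

```python
def tag_precision(generated, gt_tags, k=20):
    """Percentage of the top-k generated tags that appear in the GT set."""
    top_k = generated[:k]
    gt = set(gt_tags)
    return 100.0 * sum(tag in gt for tag in top_k) / len(top_k)

# Top-20 generated tags and the GT tags from the 09:50:24 record above:
generated = ['shoe', 'tennis', '[UNK]', 'short', 'court', 'leg', 'sock',
             'hand', 'line', 'shirt', 'head', 'shadow', 'hair', 'woman',
             'ground', 'arm', 'handle', 'player', 'face', 'logo']
gt_tags = ['[UNK]', 'head', 'hand', 'line', 'young', 'player', 'woman',
           'court', 'short', 'ground', 'hair', 'arm', 'ball', 'shirt', 'leg',
           'handle', 'tennis', 'string', 'shadow', 'shoe', 'ponytail', 'sock']
print(tag_precision(generated, gt_tags))  # 90.0: 18 of 20 are in the GT set
```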
2022-03-17 09:53:34,275.275 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02350086346268654
2022-03-17 09:53:34,276.276 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 09:53:34,276.276 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'boat', 'sits', 'on', 'a', 'river', '[MASK]', 'green', 'trees', 'and', '[MASK]', 'around', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 09:53:34,291.291 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'reflection', 'water', 'boat', 'grass', 'window', 'canal', 'light', 'person', 'dock', 'river', 'path', 'bank', 'bottom', 'bridge', 'building', 'roof', 'bush', '[UNK]', 'shore', 'flower', 'stripe', 'body', 'house', 'car', 'trunk', 'door', 'wall', 'top', 'plant', 'flag', 'small', 'sign', 'pole', 'lamp', 'chair', 'tire', 'next', 'post', 'sidewalk', 'man', 'blue', 'sky', 'front', 'shirt', 'umbrella', 'duck', 'bumper', 'shadow', 'can']
2022-03-17 09:53:50,229.229 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['water', 'river', 'top', 'green', 'bank', 'window', 'tree', 'boat', 'canal', 'bush', 'reflection', 'foliage', 'bumper']
2022-03-17 09:56:13,820.820 2829:trainer.py:487 do_train_dict(): eta: 3:16:48 iter: 59700 speed: 269.6 images/sec total_norm: 151.9569 (154.5600) loss: 138.4757 (140.1799) masked_loss: 1.3459 (1.3609) tag_loss: 137.1900 (138.8190) time: 1.4321 (1.8992) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.8940) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307
2022-03-17 09:56:14,181.181 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579
2022-03-17 09:56:14,181.181 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 133.41464233398438
2022-03-17 09:56:14,181.181 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.7534015681034 2022-03-17 09:56:44,281.281 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023498503491282463 2022-03-17 09:56:44,281.281 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:56:44,282.282 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'street', 'vendor', 'is', 'sitting', 'under', 'an', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:56:44,297.297 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'street', 'tire', 'ground', '[UNK]', 'car', 'truck', 'sidewalk', 'window', 'box', 'building', 'tree', 'light', 'sky', 'shirt', 'sign', 'pole', 'man', 'wheel', 'van', 'person', 'cart', 'fence', 'head', 'shoe', 'hair', 'door', 'trash', 'house', 'roof', 'food', 'line', 'hand', 'grass', 'can', 'bag', 'plate', 'woman', 'handle', 'wall', 'basket', 'lot', 'jean', 'curb', 'container', 'mirror', 'parking', 'bike', 'bicycle', 'windshield'] 2022-03-17 09:57:00,380.380 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'building', 'road', 'street', 'cup', 'car', 'seat', 'cover', 'plant', 'tree', 'shirt', 'bowl', 'truck', 'mirror', 'pole', 'bike', 'pot', 'motorcycle', 'banner', 'basket', 'cart', 'sidewalk', 'tire', 'umbrella', 'bucket', 'rack', 'vendor'] 2022-03-17 09:59:23,929.929 2829:trainer.py:487 do_train_dict(): eta: 3:13:54 iter: 59800 speed: 269.3 images/sec total_norm: 148.0454 (149.6901) loss: 134.8782 (135.0346) masked_loss: 1.3867 (1.3873) tag_loss: 133.4146 (133.6473) time: 1.4325 (1.9011) data: 0.0001 (0.0005) to_device: 0.0051 (0.0051) time_gpu: 1.4272 (1.8955) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307 2022-03-17 09:59:24,292.292 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5555555820465088 2022-03-17 09:59:24,292.292 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.06137084960938 2022-03-17 09:59:24,292.292 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.75623725929324 2022-03-17 09:59:54,405.405 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02354694902896881 2022-03-17 09:59:54,405.405 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 09:59:54,405.405 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'and', 'son', 'joking', 'around', 'while', '[MASK]', 'the', 'park', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 09:59:54,420.420 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'woman', 'hair', 'hand', '[UNK]', 'face', 'head', 'nose', 'ear', 'arm', 'eye', 'man', 'camera', 'mouth', 'top', 'glasses', 'person', 'tank', 'grass', 'girl', 'sunglasses', 'tree', 'boy', 'leaf', 'hat', 'sleeve', 'couple', 'jean', 'flower', 'plate', 'dress', 'young', 'button', 'cap', 'pocket', 'short', 'vest', 'tattoo', 'design', 'wall', 'watch', 'white', 'black', 'next', 'pole', 'plant', 'microphone', 'lid', 'shoe', 'drum'] 2022-03-17 10:00:10,409.409 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'top', 'son', 'park', 'woman', 'hair', 'girl', 'arm', 'boy', 'eye', 'tree', 'shirt', 'nose', 'ear', 'tank', 'bottle', 'disc', 'glasses', 'sunglasses'] 2022-03-17 10:02:33,801.801 2829:trainer.py:487 do_train_dict(): eta: 3:10:59 iter: 59900 speed: 269.7 images/sec total_norm: 149.5469 (152.0728) loss: 139.1802 (139.9196) masked_loss: 1.4383 (1.4619) tag_loss: 137.6618 (138.4577) time: 1.4319 (1.8987) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4270 (1.8937) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307 2022-03-17 10:02:34,164.164 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 10:02:34,164.164 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 123.70503234863281 2022-03-17 10:02:34,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.76532523473104
2022-03-17 10:03:04,303.303 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02358384057879448
2022-03-17 10:03:04,303.303 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:03:04,303.303 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'shelf', 'with', 'bowls', 'lined', '[MASK]', '[MASK]', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:03:04,319.319 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bowl', 'wall', 'shelf', 'vase', 'pot', '[UNK]', 'handle', 'bird', 'plant', 'pitcher', 'table', 'rim', 'design', 'ledge', 'bathroom', 'bowls', 'line', 'cup', 'base', 'frame', 'jug', 'leaf', 'flower', 'lid', 'mirror', 'top', 'window', 'ceramic', 'container', 'outlet', 'picture', 'wood', 'jar', 'cabinet', 'book', 'tree', 'item', 'bottom', 'door', 'toilet', 'stem', 'white', 'light', 'paper', 'fruit', 'counter', 'coffee', 'tea', 'sink', 'mug']
2022-03-17 10:03:20,278.278 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'wall', 'paper', 'plant', 'coffee', 'bowl', 'bird', 'handle', 'leaf', 'item', 'pot', 'shelf', 'lid', 'bowls', 'jar', 'vase', 'jug']
2022-03-17 10:05:43,614.614 2829:trainer.py:487 do_train_dict(): eta: 3:08:05 iter: 60000 speed: 269.7 images/sec total_norm: 148.3866 (150.4786) loss: 138.7343 (141.0275) masked_loss: 1.3430 (1.4124) tag_loss: 137.4193 (139.6151) time: 1.4305 (1.8981) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4253 (1.8930) save_time: 8.8421 (14.7395) lr: 0.000010 max mem: 26307
2022-03-17 10:05:43,617.617 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0060000.pt
2022-03-17 10:05:53,024.024 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 10:05:53,024.024 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.82579040527344
2022-03-17 10:05:53,024.024 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.76604342976347
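At iter 60000 checkpoint.py writes snapshot/model_iter_0060000.pt under the run's output directory. A minimal sketch of iteration-stamped checkpointing with that naming scheme; only the path pattern comes from the log, while the helper name and saved fields are assumptions:

```python
import os
import torch

def save_snapshot(model, optimizer, iteration, output_dir):
    """Write e.g. <output_dir>/snapshot/model_iter_0060000.pt (sketch,
    not the pipeline's actual checkpoint.py)."""
    snapshot_dir = os.path.join(output_dir, "snapshot")
    os.makedirs(snapshot_dir, exist_ok=True)
    # 7-digit zero padding reproduces the model_iter_0060000.pt name above.
    path = os.path.join(snapshot_dir, f"model_iter_{iteration:07d}.pt")
    torch.save({"iteration": iteration,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)
    return path
```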
2022-03-17 10:06:23,003.003 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02360193431377411
2022-03-17 10:06:23,003.003 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:06:23,004.004 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'holding', 'up', 'a', '[MASK]', 'phone', 'at', 'a', 'desk', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:06:23,019.019 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['screen', 'wall', 'tablet', 'laptop', 'thumb', 'hand', 'keyboard', 'finger', 'phone', 'picture', 'computer', 'button', 'person', 'key', 'nail', 'desk', 'cover', '[UNK]', 'cord', 'icon', 'letter', 'logo', 'cell', 'train', 'game', 'wire', 'speaker', 'case', 'monitor', 'device', 'frame', 'text', 'man', 'building', 'remote', 'front', 'box', 'ring', 'floor', 'table', 'ship', 'iphone', 'black', 'wheel', 'smart', 'palm', 'container', 'open', 'paper', 'writing']
2022-03-17 10:06:38,857.857 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'person', 'table', 'wall', 'phone', 'key', 'paper', 'computer', 'cell', 'picture', 'screen', 'finger', 'desk', 'button', 'remote', 'thumb', 'monitor', 'logo', 'keyboard', 'laptop', 'tablet']
2022-03-17 10:09:01,521.521 2829:trainer.py:487 do_train_dict(): eta: 3:05:11 iter: 60100 speed: 258.7 images/sec total_norm: 149.0059 (152.4156) loss: 136.8327 (137.3492) masked_loss: 1.3748 (1.3992) tag_loss: 135.5767 (135.9500) time: 1.4303 (1.9791) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4252 (1.8836) save_time: 8.8421 (14.2643) lr: 0.000010 max mem: 26307
2022-03-17 10:09:01,882.882 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361
2022-03-17 10:09:01,883.883 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.20635986328125
2022-03-17 10:09:01,883.883 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.77072749977492
2022-03-17 10:09:31,935.935 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023635121062397957
2022-03-17 10:09:31,936.936 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:09:31,936.936 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'line', 'of', 'two', '[MASK]', 'parked', 'on', 'the', 'side', 'of', '[MASK]', 'road', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:09:31,952.952 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['road', 'bus', 'street', 'sidewalk', 'window', 'pole', 'tire', 'line', 'building', '[UNK]', 'person', 'windshield', 'sign', 'post', 'tree', 'door', 'wheel', 'light', 'plate', 'sky', 'front', 'fence', 'man', 'jacket', 'car', 'license', 'curb', 'woman', 'shirt', 'mirror', 'bag', 'city', 'wall', 'lamp', 'roof', 'jean', 'coat', 'hair', 'bench', 'stop', 'number', 'bush', 'railing', 'yellow', 'can', 'driver', 'logo', 'bike', 'trash', 'motorcycle']
2022-03-17 10:09:47,783.783 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'side', 'line', 'building', 'road', 'street', 'light', 'post', 'person', 'window', 'sign', 'bus', 'traffic', 'flag', 'wheel', 'panel', 'pole', 'banner', 'lamp', 'balcony', 'sidewalk', 'tire', 'windshield']
2022-03-17 10:12:11,491.491 2829:trainer.py:487 do_train_dict(): eta: 3:02:17 iter: 60200 speed: 269.5 images/sec total_norm: 148.8860 (151.3608) loss: 138.3056 (138.7536) masked_loss: 1.3577 (1.3690) tag_loss: 137.4783 (137.3847) time: 1.4317 (1.8997) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.8945) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307
2022-03-17 10:12:11,850.850 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6756756901741028
2022-03-17 10:12:11,851.851 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.8243408203125
2022-03-17 10:12:11,851.851 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.77542744979732
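The displayed lr drops by 1e-6 roughly every 650-700 iterations (0.000011 at iter 59000 down to 0.000007 at iter 61500). That slope is what a linear decay of the run's 1e-4 peak (per the output-directory name) to zero at about iteration 66,000 would display after rounding; max_iter = 66000 is an inference from this log, not a value it states, and any warmup phase is ignored here:

```python
def linear_decay_lr(iteration, base_lr=1e-4, max_iter=66000):
    """Linearly decay base_lr to zero at max_iter (inferred, see above)."""
    return base_lr * max(0.0, (max_iter - iteration) / max_iter)

for it in (59000, 60200, 61500):
    print(it, round(linear_decay_lr(it), 6))
# 59000 1.1e-05, 60200 9e-06, 61500 7e-06 -- matching the lr column above
```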
2022-03-17 10:12:42,016.016 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023659884929656982
2022-03-17 10:12:42,017.017 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:12:42,017.017 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'sitting', 'at', 'a', 'kitchen', 'table', 'opens', 'wide', 'to', 'take', 'the', 'first', 'bite', '[MASK]', 'a', 'sandwich', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:12:42,032.032 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['candle', 'table', 'hair', 'plate', 'bottle', 'man', 'shirt', 'window', 'hand', 'sweater', 'flame', 'restaurant', '[UNK]', 'mouth', 'sandwich', 'light', 'wall', 'fork', 'water', 'food', 'glass', 'napkin', 'holder', 'label', 'bowl', 'woman', 'head', 'chair', 'bread', 'hot', 'person', 'liquid', 'knife', 'nose', 'face', 'tattoo', 'ring', 'ceiling', 'cake', 'straw', 'bun', 'arm', 'dog', 'salt', 'spoon', 'container', 'flower', 'room', 'front', 'boy']
2022-03-17 10:12:58,049.049 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'first', 'man', 'hand', 'water', 'light', 'hair', 'mouth', 'person', 'table', 'food', 'wide', 'hot', 'window', 'shirt', 'kitchen', 'dog', 'teeth', 'restaurant', 'plate', 'cabinet', 'bottle', 'bite', 'bread', 'flame', 'holder', 'fork', 'lighter', 'towel', 'sandwich', 'tattoo', 'candle', 'sweater', 'napkin']
03-17 10:13:28.973 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 10:13:28.973 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 10:13:30.235 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 10:15:21,591.591 2829:trainer.py:487 do_train_dict(): eta: 2:59:22 iter: 60300 speed: 269.3 images/sec total_norm: 149.8289 (150.9457) loss: 136.7347 (137.5155) masked_loss: 1.4130 (1.4382) tag_loss: 135.4882 (136.0772) time: 1.4320 (1.9010) data: 0.0001 (0.0002) to_device: 0.0050 (0.0050) time_gpu: 1.4267 (1.8959) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307
2022-03-17 10:15:21,951.951 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7428571581840515
2022-03-17 10:15:21,951.951 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.19361877441406
2022-03-17 10:15:21,952.952 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.77935245968648 2022-03-17 10:15:51,924.924 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023659594357013702 2022-03-17 10:15:51,925.925 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:15:51,925.925 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'that', '[MASK]', 'herd', '##ing', 'two', 'sheep', 'while', 'people', 'watch', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:15:51,941.941 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'sheep', 'grass', 'dog', 'chair', '[UNK]', 'hat', 'shirt', 'man', 'jacket', 'woman', 'fence', 'field', 'pole', 'head', 'leg', 'bag', 'jean', 'shoe', 'flag', 'umbrella', 'hair', 'tree', 'stand', 'face', 'wool', 'ground', 'boy', 'sign', 'child', 'tail', 'girl', 'spectator', 'line', 'cap', 'table', 'lamb', 'dirt', 'group', 'coat', 'horn', 'hand', 'helmet', 'animal', 'wall', 'post', 'goat', 'sky', 'net', 'bat'] 2022-03-17 10:16:07,853.853 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'line', 'woman', 'field', 'hair', 'person', 'table', 'boy', 'chair', 'jean', 'shirt', 'dog', 'animal', 'leg', 'grass', 'hat', 'jacket', 'bench', 'dirt', 'sheep', 'fence'] 2022-03-17 10:18:31,835.835 2829:trainer.py:487 do_train_dict(): eta: 2:56:28 iter: 60400 speed: 269.1 images/sec total_norm: 148.0683 (151.8012) loss: 140.3271 (140.4253) masked_loss: 1.3901 (1.4215) tag_loss: 138.8503 (139.0038) time: 1.4316 (1.9024) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.8973) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:18:32,196.196 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 10:18:32,196.196 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 117.75284576416016 2022-03-17 10:18:32,196.196 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.78705593771186 2022-03-17 10:19:02,443.443 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023665286600589752 2022-03-17 10:19:02,443.443 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:19:02,443.443 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'picture', 'of', '[MASK]', 'cows', 'eating', 'grass', '[MASK]', 'a', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:19:02,459.459 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'cow', 'tree', 'head', 'field', 'fence', 'leg', 'sky', 'tail', 'bush', 'face', 'plant', 'cloud', 'post', 'building', 'pasture', 'ear', 'green', 'grassy', 'ground', 'pole', 'flower', 'background', 'roof', '[UNK]', 'cattle', 'hill', 'house', 'leaf', 'grazing', 'weed', 'mountain', 'brown', 'white', 'herd', 'lush', 'top', 'animal', 'group', 'trunk', 'horse', 'bird', 'branch', 'sheep', 'barn', 'area', 'open', 'rock', 'horn', 'neck'] 2022-03-17 10:19:18,392.392 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'field', 'post', 'tree', 'sky', 'picture', 'leg', 'grass', 'bush', 'cloud', 'fence', 'cow', 'weed'] 2022-03-17 10:21:42,109.109 2829:trainer.py:487 do_train_dict(): eta: 2:53:33 iter: 60500 speed: 269.1 images/sec total_norm: 147.7416 (151.3848) loss: 138.9736 (142.1665) masked_loss: 1.4013 (1.4407) tag_loss: 137.5656 (140.7258) time: 1.4303 (1.9028) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4251 (1.8975) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:21:42,470.470 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.46875 2022-03-17 10:21:42,470.470 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 160.2852783203125 2022-03-17 10:21:42,471.471 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.77744458768234 2022-03-17 10:22:12,979.979 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023684196174144745 2022-03-17 10:22:12,979.979 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:22:12,979.979 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'computer', 'keyboards', 'are', 'displayed', 'on', '[MASK]', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:22:12,995.995 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['desk', 'keyboard', 'table', 'key', 'mouse', 'cord', 'paper', 'wire', 'button', 'logo', 'computer', '[UNK]', 'wall', 'book', 'phone', 'plug', 'wooden', 'handle', 'base', 'black', 'pad', 'next', 'top', 'pen', 'telephone', 'cup', 'cable', 'ipod', 'box', 'speaker', 'mug', 'object', 'outlet', 'knob', 'writing', 'circle', 'light', 'monitor', 'white', 'camera', 'small', 'letter', 'coffee', 'floor', 'cap', 'container', 'board', 'other', 'bag', 'stand'] 2022-03-17 10:22:28,894.894 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['book', 'table', 'phone', 'key', 'paper', 'computer', 'cell', 'desk', 'button', 'wire', 'mouse', 'logo', 'keyboard', 'cord'] 2022-03-17 10:24:52,616.616 2829:trainer.py:487 do_train_dict(): eta: 2:50:38 iter: 60600 speed: 268.8 images/sec total_norm: 147.0846 (149.8204) loss: 137.7735 (137.5887) masked_loss: 1.3666 (1.3823) tag_loss: 135.8464 (136.2064) time: 1.4323 (1.9050) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4270 (1.8998) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:24:52,977.977 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625 2022-03-17 10:24:52,977.977 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.27268981933594 2022-03-17 10:24:52,977.977 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.78060899888467 2022-03-17 10:25:23,499.499 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02371017448604107 2022-03-17 10:25:23,499.499 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:25:23,500.500 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'kids', 'in', 'a', '[MASK]', 'plays', 'with', 'a', '[MASK]', 'outside', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:25:23,515.515 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['grass', 'person', 'shirt', 'tree', 'man', 'ground', 'field', '[UNK]', 'short', 'leg', 'hat', 'head', 'hair', 'cap', 'shadow', 'arm', 'group', 'shoe', 'woman', 'hand', 'girl', 'park', 'background', 'dog', 'bag', 'boy', 'jersey', 'pole', 'bat', 'bird', 'ball', 'kite', 'baseball', 'child', 'sock', 'player', 'uniform', 'grassy', 'green', 'can', 'jean', 'game', 'crowd', 'jacket', 'bush', 'tail', 'trash', 'cone', 'soccer', 'flag'] 2022-03-17 10:25:39,405.405 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'group', 'air', 'park', 'field', 'ground', 'hair', 'girl', 'person', 'child', 'boy', 'couple', 'tree', 'sky', 'jean', 'shirt', 'dress', 'string', 'shadow', 'grass', 'cloud', 'trunk', 'kit', 'skirt', 'kite'] 2022-03-17 10:28:03,002.002 2829:trainer.py:487 do_train_dict(): eta: 2:47:44 iter: 60700 speed: 268.9 images/sec total_norm: 149.5002 (151.8643) loss: 136.4660 (136.8315) masked_loss: 1.3754 (1.4175) tag_loss: 135.2537 (135.4139) time: 1.4309 (1.9039) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4258 (1.8988) save_time: 8.8421 (14.2643) lr: 0.000009 max mem: 26307 2022-03-17 10:28:03,362.362 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 10:28:03,362.362 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.81192016601562 2022-03-17 10:28:03,363.363 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.7867005687011 2022-03-17 10:28:33,927.927 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02372957579791546 2022-03-17 10:28:33,927.927 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:28:33,927.927 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'flock', 'of', 'sheep', 'climbing', 'up', '[MASK]', 'crest', 'of', '[MASK]', 'hill', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:28:33,943.943 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'sky', 'sheep', 'grass', 'head', 'leg', 'field', 'mountain', 'hill', 'ear', 'face', 'herd', 'background', 'group', 'bush', 'ground', 'wool', '[UNK]', 'tail', 'cow', 'grassy', 'grazing', 'bucket', 'open', 'lamb', 'trough', 'house', 'animal', 'container', 'building', 'green', 'standing', 'next', 'spot', 'rock', 'distance', 'pasture', 'stand', 'top', 'large', 'other', 'flock', 'hay', 'middle', 'nose', 'bunch', 'white', 'couple', 'area', 'food'] 2022-03-17 10:28:49,908.908 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'group', 'face', 'field', 'hill', 'mountain', 'metal', 'tree', 'sky', 'leg', 'ear', 'grass', 'bush', 'sheep', 'crest', 'herd', 'flock'] 2022-03-17 10:31:13,531.531 2829:trainer.py:487 do_train_dict(): eta: 2:44:49 iter: 60800 speed: 268.7 images/sec total_norm: 149.1399 (151.9852) loss: 136.9570 (137.5027) masked_loss: 1.3578 (1.3858) tag_loss: 135.9295 (136.1169) time: 1.4315 (1.9053) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4262 (1.9002) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:31:13,892.892 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 10:31:13,892.892 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.19613647460938 2022-03-17 10:31:13,892.892 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.78898504723861
2022-03-17 10:31:44,388.388 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023804496973752975
2022-03-17 10:31:44,388.388 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:31:44,388.388 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'riding', 'a', 'speed', 'boat', 'on', '##wani', 'of', 'a', 'body', 'of', 'water', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:31:44,404.404 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'wall', 'boat', 'man', 'snow', 'person', 'plant', 'stripe', 'tree', 'fence', 'shirt', '[UNK]', 'jacket', 'head', 'bush', 'hair', 'ski', 'building', 'wave', 'hat', 'glove', 'rope', 'umbrella', 'woman', 'bottom', 'front', 'writing', 'hand', 'motor', 'pot', 'step', 'flag', 'top', 'car', 'handle', 'stair', 'windshield', 'vest', 'railing', 'number', 'small', 'ground', 'helmet', 'hood', 'coat', 'light', 'pole', 'dock', 'logo', 'window']
2022-03-17 10:32:00,322.322 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'water', 'body', 'top', 'front', 'person', 'wall', 'speed', 'plant', 'tree', 'border', 'bottom', 'boat', 'snow', 'jacket', 'hood', 'helmet', 'glove', 'stripe']
2022-03-17 10:34:24,122.122 2829:trainer.py:487 do_train_dict(): eta: 2:41:54 iter: 60900 speed: 268.6 images/sec total_norm: 149.0155 (151.4459) loss: 140.3364 (140.3380) masked_loss: 1.4813 (1.4576) tag_loss: 139.0472 (138.8804) time: 1.4318 (1.9059) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9004) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307
2022-03-17 10:34:24,483.483 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 10:34:24,483.483 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.6436767578125
2022-03-17 10:34:24,484.484 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.79033829110568
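The eta field is consistent with plain arithmetic: the smoothed per-iteration time multiplied by the iterations remaining. Reading the iter-60900 record above backwards, 2:41:54 = 9714 s at 1.9059 s/iter implies about 5,100 iterations left, again pointing at an endpoint near iteration 66,000 (an inference, as in the lr note earlier):

```python
import datetime

def eta(iteration, avg_iter_time_s, max_iter=66000):
    """ETA = smoothed seconds/iter x iterations remaining (max_iter inferred)."""
    remaining_s = (max_iter - iteration) * avg_iter_time_s
    return str(datetime.timedelta(seconds=int(remaining_s)))

print(eta(60900, 1.9059))  # 2:42:00, within seconds of "eta: 2:41:54" above
```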
2022-03-17 10:34:54,872.872 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0238014105707407
2022-03-17 10:34:54,873.873 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 10:34:54,874.874 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'horse', 'kicking', 'another', '[MASK]', 'in', 'the', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 10:34:54,889.889 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['leg', 'grass', 'fence', 'horse', 'tail', '[UNK]', 'head', 'mane', 'zebra', 'wood', 'wall', 'ground', 'log', 'animal', 'person', 'ear', 'short', 'pole', 'building', 'shirt', 'harness', 'man', 'field', 'hair', 'leaf', 'rope', 'gate', 'nose', 'shoe', 'eye', 'mouth', 'dog', 'jean', 'neck', 'window', 'rock', 'tree', 'jacket', 'bag', 'stick', 'enclosure', 'board', 'hand', 'wooden', 'roof', 'goat', 'legs', 'green', 'box', 'next']
2022-03-17 10:35:10,822.822 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'building', 'field', 'ground', 'hair', 'wall', 'goal', 'neck', 'horse', 'shirt', 'leg', 'roof', 'ear', 'grass', 'tail', 'pole', 'fence', 'log', 'shoe', 'harness', 'vest', 'mane', 'zebra']
2022-03-17 10:37:34,803.803 2829:trainer.py:487 do_train_dict(): eta: 2:38:59 iter: 61000 speed: 268.5 images/sec total_norm: 149.0821 (151.2864) loss: 140.4911 (140.9709) masked_loss: 1.3454 (1.4337) tag_loss: 139.1099 (139.5371) time: 1.4306 (1.9067) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4256 (1.9016) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307
2022-03-17 10:37:35,164.164 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 10:37:35,164.164 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.09637451171875
2022-03-17 10:37:35,164.164 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.79372715052529 2022-03-17 10:38:05,830.830 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023831909522414207 2022-03-17 10:38:05,830.830 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:38:05,830.830 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'mean', 'are', 'standing', 'in', 'the', 'open', 'door', '[unused53]', 'a', 'city', 'bus', '.', '##ei', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:38:05,845.845 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'head', 'train', 'roof', 'shirt', 'man', 'sky', 'door', 'hair', 'tree', 'bus', '[UNK]', 'stripe', 'woman', 'hand', 'face', 'car', 'person', 'number', 'letter', 'hat', 'sign', 'arm', 'top', 'skirt', 'dress', 'light', 'logo', 'jean', 'handle', 'track', 'building', 'mirror', 'leg', 'shadow', 'curtain', 'tire', 'pole', 'boy', 'seat', 'lady', 'bag', 'wheel', 'passenger', 'street', 'ground', 'jacket', 'scarf', 'cloud', 'shoe'] 2022-03-17 10:38:21,741.741 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'head', 'man', 'hand', 'open', 'door', 'car', 'hair', 'person', 'boy', 'window', 'train', 'bar', 'tree', 'watch', 'box', 'letter', 'sign', 'sky', 'jean', 'shirt', 'bus', 'roof', 'passenger', 'billboard', 'handle', 'hat', 'pole', 'hood', 'logo', 'cab', 'taxi', 'trolley'] 2022-03-17 10:40:45,330.330 2829:trainer.py:487 do_train_dict(): eta: 2:36:04 iter: 61100 speed: 268.7 images/sec total_norm: 147.3039 (150.5710) loss: 136.7030 (137.7486) masked_loss: 1.3435 (1.3712) tag_loss: 135.2228 (136.3774) time: 1.4314 (1.9053) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.9001) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:40:45,691.691 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 10:40:45,691.691 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.9183349609375 2022-03-17 10:40:45,691.691 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.7994185117335 2022-03-17 10:41:16,084.084 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023850420489907265 2022-03-17 10:41:16,085.085 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:41:16,086.086 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', '[MASK]', 'polar', '[MASK]', 'standing', 'on', 'a', 'icy', 'pool', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:41:16,102.102 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'ear', 'water', 'polar', 'head', 'nose', 'rock', 'leg', 'eye', 'mouth', 'snow', 'ground', 'paw', 'pool', 'shadow', 'face', 'claw', 'ice', 'fur', 'teeth', 'ball', 'wall', 'zoo', 'boulder', 'white', 'large', 'sand', 'object', 'stone', 'swimming', '[UNK]', 'tail', 'splash', 'bubble', 'background', 'big', 'grass', 'top', 'tongue', 'wave', 'next', 'toy', 'neck', 'standing', 'structure', 'tree', 'ledge', 'snout', 'foam', 'exhibit'] 2022-03-17 10:41:32,029.029 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'water', 'large', 'white', 'ground', 'rock', 'mouth', 'eye', 'leg', 'nose', 'ear', 'bear', 'snow', 'pool', 'handle', 'polar', 'sidewalk', 'cone', 'boulder', 'icy', 'curb', 'weed'] 03-17 10:43:30.336 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 10:43:30.336 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 10:43:31.430 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 95}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 98}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 97}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}] 2022-03-17 10:43:55,801.801 2829:trainer.py:487 do_train_dict(): eta: 2:33:09 iter: 61200 speed: 268.8 images/sec total_norm: 148.4808 (151.6336) loss: 137.6155 (138.9650) masked_loss: 1.3438 (1.3520) tag_loss: 136.1368 (137.6131) time: 1.4304 (1.9048) data: 0.0001 (0.0002) to_device: 0.0052 (0.0050) time_gpu: 1.4252 (1.8996) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:43:56,161.161 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 10:43:56,161.161 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 114.75300598144531 2022-03-17 10:43:56,161.161 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.80511071981458 2022-03-17 10:44:26,725.725 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023850349709391594 2022-03-17 10:44:26,726.726 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:44:26,726.726 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'toilet', 'with', 'a', 'wooden', 'lid', '[MASK]', 'toilet', 'paper', 'sitting', '[MASK]', 'top', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:44:26,741.741 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['toilet', 'wall', 'floor', 'paper', 'bathroom', 'roll', 'seat', 'pipe', 'tile', 'lid', 'bowl', 'holder', 'hole', 'handle', 'brush', '[UNK]', 'tank', 'line', 'carpet', 'water', 'bottle', 'wooden', 'ground', 'bar', 'wood', 'small', 'container', 'can', 'label', 'next', 'tape', 'trash', 'door', 'tube', 'white', 'metal', 'top', 'tissue', 'sink', 'towel', 'cap', 'hose', 'restroom', 'close', 'knob', 'open', 'bucket', 'object', 'ring', 'soap'] 2022-03-17 10:44:42,718.718 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'top', 'floor', 'wall', 'seat', 'paper', 'label', 'bowl', 'wooden', 'handle', 'bathroom', 'bottle', 'brush', 'pipe', 'carpet', 'container', 'toilet', 'lid', 'tile'] 2022-03-17 10:47:06,158.158 2829:trainer.py:487 do_train_dict(): eta: 2:30:14 iter: 61300 speed: 269.0 images/sec total_norm: 146.8184 (148.2614) loss: 133.5750 (135.4241) masked_loss: 1.4249 (1.4418) tag_loss: 132.1449 (133.9823) time: 1.4315 (1.9035) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4265 (1.8984) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:47:06,518.518 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 10:47:06,519.519 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.29115295410156 2022-03-17 10:47:06,519.519 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.80900373055026 2022-03-17 10:47:37,478.478 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023856952786445618 2022-03-17 10:47:37,479.479 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:47:37,479.479 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bear', 'is', '[MASK]', 'behind', 'a', 'chain', '[MASK]', 'fence', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:47:37,494.494 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['fence', 'bear', 'leaf', 'tree', 'ground', 'ear', 'head', 'pole', 'rock', 'plant', 'trunk', 'animal', 'bush', 'black', 'face', 'leg', 'cow', 'eye', '[UNK]', 'snout', 'large', 'mouth', 'log', 'wire', 'nose', 'area', 'branch', 'brown', 'zoo', 'grass', 'post', 'forest', 'weed', 'tag', 'tongue', 'next', 'enclosure', 'car', 'link', 'big', 'walking', 'neck', 'small', 'building', 'top', 'standing', 'window', 'field', 'baby', 'collar'] 2022-03-17 10:47:53,388.388 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'ground', 'mouth', 'plant', 'tree', 'tongue', 'nose', 'bear', 'chain', 'link', 'grass', 'leaf', 'trunk', 'fence', 'cow'] 2022-03-17 10:50:16,779.779 2829:trainer.py:487 do_train_dict(): eta: 2:27:19 iter: 61400 speed: 268.6 images/sec total_norm: 147.6159 (149.4563) loss: 139.9171 (139.4900) masked_loss: 1.3465 (1.3837) tag_loss: 138.0495 (138.1063) time: 1.4322 (1.9062) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4271 (1.9010) save_time: 8.8421 (14.2643) lr: 0.000008 max mem: 26307 2022-03-17 10:50:17,141.141 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 10:50:17,141.141 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.79714965820312 2022-03-17 10:50:17,141.141 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
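The `eta:` field shrinks by roughly 2:55 per 100 iterations at ~1.9 s/iter, which fits the usual estimate of remaining iterations times the averaged iteration time. A sketch follows; the total iteration count is not visible in this excerpt, so max_iter here is an assumption chosen to roughly fit the logged values:

```python
# Sketch of the eta computation implied by the log; max_iter is assumed.
import datetime

def eta_string(avg_iter_time_s, cur_iter, max_iter):
    remaining_s = avg_iter_time_s * (max_iter - cur_iter)
    return str(datetime.timedelta(seconds=int(remaining_s)))

# eta_string(1.9035, 61300, 66000) -> '2:29:06', near the logged 2:30:14
```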
= 71.81017521183665 2022-03-17 10:50:48,056.056 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023853259161114693 2022-03-17 10:50:48,056.056 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:50:48,056.056 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'dead', '[MASK]', '[MASK]', 'bears', 'on', 'display', 'at', 'a', 'museum', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:50:48,072.072 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'bear', 'ear', 'head', 'sky', 'cloud', 'leg', 'wall', 'paw', 'rock', 'nose', 'ground', 'log', 'trunk', 'wood', 'eye', 'mountain', 'zoo', 'face', 'plant', 'brown', 'branch', 'hole', 'large', 'bush', 'stump', 'grass', 'shadow', 'enclosure', 'stick', 'foot', 'claw', 'bark', 'cliff', 'leaf', 'top', 'forest', 'mouth', 'hill', 'next', 'small', 'floor', 'stone', 'formation', 'arm', 'dirt', 'tail', 'museum', 'animal', 'brick'] 2022-03-17 10:51:04,012.012 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'ground', 'rock', 'museum', 'dead', 'brown', 'eye', 'tree', 'wood', 'sky', 'leg', 'nose', 'ear', 'bear', 'display', 'cloud', 'log', 'paw'] 2022-03-17 10:53:27,482.482 2829:trainer.py:487 do_train_dict(): eta: 2:24:24 iter: 61500 speed: 268.5 images/sec total_norm: 147.8613 (150.6930) loss: 135.0694 (136.9971) masked_loss: 1.3657 (1.4089) tag_loss: 133.6291 (135.5882) time: 1.4313 (1.9071) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4261 (1.9018) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 10:53:27,845.845 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 10:53:27,846.846 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 104.88545227050781 2022-03-17 10:53:27,846.846 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
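In every do_train_dict() record, `loss` tracks `masked_loss + tag_loss` closely (at iter 61500: 1.3657 + 133.6291 = 134.9948 against a reported 135.0694), so the objective appears to sum a masked-language-model caption loss and a multi-label tag loss; the small residual could be another term or a different reduction. A sketch of that composition, with reductions and weighting as assumptions:

```python
# Sketch of the loss composition implied by the logged fields; the exact
# reductions/weights are assumptions and may not match the pipeline's.
import torch.nn.functional as F

def combined_loss(mlm_logits, mlm_labels, tag_logits, tag_targets):
    # masked_loss: cross-entropy over masked caption positions only;
    # non-scored positions carry the PyTorch ignore label -100
    masked_loss = F.cross_entropy(
        mlm_logits.reshape(-1, mlm_logits.size(-1)),
        mlm_labels.reshape(-1),
        ignore_index=-100,
    )
    # tag_loss: multi-label objective over the tag vocabulary
    # (tag_targets is a float 0/1 matrix; summed, then averaged per sample)
    tag_loss = F.binary_cross_entropy_with_logits(
        tag_logits, tag_targets, reduction="sum"
    ) / tag_logits.size(0)
    return masked_loss + tag_loss, masked_loss, tag_loss
```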
= 71.81883064492956 2022-03-17 10:53:58,773.773 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023857463151216507 2022-03-17 10:53:58,773.773 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:53:58,773.773 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'sleeps', 'on', 'techniques', 'red', 'carpet', 'with', 'tennis', 'shoes', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:53:58,789.789 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'head', 'ear', '[UNK]', 'shoe', 'carpet', 'fur', 'nose', 'paw', 'floor', 'wall', 'leg', 'red', 'white', 'cord', 'face', 'couch', 'eye', 'back', 'tail', 'body', 'person', 'rug', 'hand', 'blanket', 'spot', 'top', 'next', 'toy', 'gray', 'wire', 'string', 'pair', 'black', 'blue', 'grey', 'stripe', 'kitten', 'pink', 'sock', 'small', 'logo', 'foot', 'sleeping', 'light', 'close', 'chair', 'bag', 'playing', 'ball'] 2022-03-17 10:54:14,765.765 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'red', 'wall', 'nose', 'ear', 'cat', 'tennis', 'fur', 'carpet', 'shoe', 'paw', 'sleeps'] 2022-03-17 10:56:38,056.056 2829:trainer.py:487 do_train_dict(): eta: 2:21:29 iter: 61600 speed: 268.7 images/sec total_norm: 147.9398 (152.3039) loss: 140.0132 (140.6528) masked_loss: 1.3359 (1.3725) tag_loss: 138.6502 (139.2804) time: 1.4314 (1.9057) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4261 (1.9006) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 10:56:38,417.417 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 10:56:38,418.418 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 156.3429718017578 2022-03-17 10:56:38,418.418 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
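The `Input ids sample` lines show BERT-style input corruption: most selected positions become '[MASK]', and a few are swapped for random vocabulary tokens, which is what produces oddities like 'techniques' in the "a [MASK] sleeps on techniques red carpet" sample above. A sketch assuming the standard 15% / 80-10-10 BERT recipe (the actual ratios are not in the log):

```python
# Sketch of BERT-style corruption consistent with the logged samples;
# the 15% selection rate and 80/10/10 split are the standard BERT recipe,
# assumed here rather than read from the pipeline.
import torch

def corrupt_caption(input_ids, mask_token_id, vocab_size, special_mask, p=0.15):
    labels = input_ids.clone()
    probs = torch.full(input_ids.shape, p)
    probs.masked_fill_(special_mask, 0.0)      # never corrupt [CLS]/[SEP]/[PAD]
    selected = torch.bernoulli(probs).bool()
    labels[~selected] = -100                   # score only corrupted positions

    masked = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & selected
    input_ids[masked] = mask_token_id          # 80% -> [MASK]

    rand = (torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool()
            & selected & ~masked)              # 10% -> random token
    input_ids[rand] = torch.randint(vocab_size, input_ids.shape)[rand]
    return input_ids, labels                   # remaining 10% kept as-is
```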
= 71.81930265627572 2022-03-17 10:57:09,554.554 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023935874924063683 2022-03-17 10:57:09,555.555 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 10:57:09,555.555 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'in', 'the', '[MASK]', ',', 'a', 'lady', 'getting', 'something', 'from', 'the', '[MASK]', 'while', 'a', 'man', 'is', 'putting', 'something', 'in', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 10:57:09,570.570 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'kitchen', 'hair', '[UNK]', 'cabinet', 'glasses', 'refrigerator', 'man', 'woman', 'bottle', 'window', 'hand', 'door', 'handle', 'bowl', 'food', 'head', 'floor', 'pot', 'wall', 'drawer', 'pan', 'table', 'arm', 'person', 'jean', 'towel', 'can', 'paper', 'box', 'face', 'bag', 'knife', 'napkin', 'apron', 'lid', 'lady', 'stove', 'girl', 'container', 'ear', 'short', 'jug', 'board', 'plate', 'light', 'cup', 'ceiling', 'shoe', 'shelf'] 2022-03-17 10:57:25,617.617 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'can', 'man', 'face', 'something', 'door', 'light', 'woman', 'hair', 'floor', 'table', 'wall', 'food', 'lady', 'window', 'box', 'jean', 'shirt', 'kitchen', 'dress', 'bowl', 'handle', 'cabinet', 'bottle', 'ceiling', 'pan', 'glasses', 'cloth', 'pot', 'towel', 'trash', 'lid', 'stove', 'oven', 'refrigerator', 'jug'] 2022-03-17 10:59:48,861.861 2829:trainer.py:487 do_train_dict(): eta: 2:18:34 iter: 61700 speed: 268.3 images/sec total_norm: 148.6898 (151.4357) loss: 137.3496 (139.4405) masked_loss: 1.3560 (1.3984) tag_loss: 135.9817 (138.0421) time: 1.4312 (1.9080) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4261 (1.9029) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 10:59:49,222.222 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8055555820465088 2022-03-17 10:59:49,223.223 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.92286682128906 2022-03-17 10:59:49,223.223 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
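The `caption acc` values are exact small-integer fractions (0.6666… = 2/3, 0.5625 = 9/16, 0.6176… = 21/34), which fits accuracy computed over only the masked caption positions in a batch. A sketch, reusing the -100 ignore-label convention from the masking sketch above:

```python
# Sketch: accuracy over masked positions only; inputs are torch tensors,
# and labels use -100 for positions that are not scored.
def caption_accuracy(mlm_logits, mlm_labels):
    scored = mlm_labels != -100
    preds = mlm_logits.argmax(dim=-1)
    return (preds[scored] == mlm_labels[scored]).float().mean()
```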
= 71.82107164327381 2022-03-17 11:00:20,436.436 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023950770497322083 2022-03-17 11:00:20,436.436 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:00:20,436.436 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'is', 'a', 'dog', 'sitting', '[MASK]', 'the', 'cart', 'of', 'a', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:00:20,451.451 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'building', 'sky', 'window', 'tire', 'wheel', 'sign', 'bike', 'shadow', 'motorcycle', 'light', 'seat', 'door', 'pole', 'handle', 'windshield', 'road', 'street', 'ground', 'line', '[UNK]', 'roof', 'traffic', 'mirror', 'helmet', 'flag', 'mountain', 'truck', 'person', 'logo', 'tree', 'gas', 'man', 'background', 'white', 'cone', 'arrow', 'vehicle', 'shirt', 'side', 'bicycle', 'billboard', 'lot', 'parking', 'parked', 'next', 'fence', 'silver', 'sidewalk', 'fender'] 2022-03-17 11:00:36,412.412 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'line', 'building', 'door', 'road', 'street', 'light', 'car', 'window', 'sign', 'sky', 'dog', 'vehicle', 'handle', 'shadow', 'flag', 'wheel', 'tail', 'pole', 'bike', 'motorcycle', 'helmet', 'cart', 'tire', 'harness', 'windshield', 'paw'] 2022-03-17 11:02:59,749.749 2829:trainer.py:487 do_train_dict(): eta: 2:15:38 iter: 61800 speed: 268.2 images/sec total_norm: 148.8605 (152.0960) loss: 138.7571 (139.6871) masked_loss: 1.4104 (1.4105) tag_loss: 137.4981 (138.2766) time: 1.4327 (1.9089) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.9037) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 11:03:00,109.109 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6944444179534912 2022-03-17 11:03:00,110.110 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 115.57704162597656 2022-03-17 11:03:00,110.110 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.82539304200205 2022-03-17 11:03:30,800.800 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02395489253103733 2022-03-17 11:03:30,801.801 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:03:30,801.801 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'an', 'instructional', 'sign', 'is', 'placed', '[MASK]', 'a', 'fence', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:03:30,817.817 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['letter', 'sign', 'pole', 'fence', 'grass', 'ground', 'post', 'sky', 'building', 'tree', 'mountain', 'red', 'cloud', 'stop', 'wire', '[UNK]', 'parking', 'road', 'roof', 'bolt', 'hill', 'bush', 'number', 'arrow', 'car', 'water', 'lot', 'line', 'circle', 'chain', 'sand', 'area', 'lettering', 'dirt', 'screw', 'field', 'flower', 'word', 'wood', 'white', 'window', 'rock', 'close', 'box', 'truck', 'wall', 'paint', 'power', 'side', 'top'] 2022-03-17 11:03:46,737.737 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['building', 'ground', 'post', 'mountain', 'letter', 'sign', 'sky', 'boat', 'roof', 'grass', 'cloud', 'pole', 'dirt', 'wire', 'fence', 'instructional'] 2022-03-17 11:06:10,506.506 2829:trainer.py:487 do_train_dict(): eta: 2:12:43 iter: 61900 speed: 268.4 images/sec total_norm: 149.0126 (153.1112) loss: 137.2758 (138.4624) masked_loss: 1.3879 (1.4212) tag_loss: 135.9042 (137.0412) time: 1.4311 (1.9076) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4261 (1.9025) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 11:06:10,872.872 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 11:06:10,872.872 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.95523071289062 2022-03-17 11:06:10,872.872 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
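The running `Tag Precision` hovers near 71.8% and moves only in the third or fourth decimal per step, i.e. a cumulative average over many batches. One plausible reading is precision of the generated tag set against the GT tag set, sketched below; how the pipeline actually thresholds or truncates predictions is not visible in the log:

```python
# Sketch: set-precision of predicted tags vs. ground truth, in percent.
# The choice of prediction set (top-k vs. threshold) is an assumption.
def tag_precision(predicted_tags, gt_tags):
    predicted, gt = set(predicted_tags), set(gt_tags)
    return 100.0 * len(predicted & gt) / max(len(predicted), 1)

# For the iter-61900 sample above, tags like 'sign', 'pole', 'fence', and
# 'grass' appear in both the generated list and the GT list.
```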
= 71.8274375361781 2022-03-17 11:06:41,899.899 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023952605202794075 2022-03-17 11:06:41,900.900 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:06:41,900.900 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'couple', 'looking', 'into', 'each', '[MASK]', 'eyes', 'on', '[MASK]', 'bench', 'in', 'a', 'grassy', 'field', '.', '##ize', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:06:41,916.916 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'shirt', 'bench', 'hand', 'tree', 'man', 'leg', 'grass', 'woman', 'head', 'person', '[UNK]', 'girl', 'jean', 'park', 'couple', 'arm', 'short', 'face', 'baseball', 'bird', 'boy', 'watch', 'foot', 'bracelet', 'sweater', 'bush', 'shoe', 'ground', 'photo', 'top', 'seat', 'ball', 'young', 'tank', 'plant', 'back', 'necklace', 'other', 'post', 'group', 'dress', 'pole', 'glasses', 'bat', 'flower', 'wooden', 'nose', 'field', 'cup'] 2022-03-17 11:06:57,854.854 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'park', 'woman', 'field', 'hair', 'girl', 'person', 'seat', 'couple', 'tree', 'baseball', 'jean', 'shirt', 'leg', 'grass', 'bench', 'bracelet', 'grassy'] 2022-03-17 11:09:21,580.580 2829:trainer.py:487 do_train_dict(): eta: 2:09:48 iter: 62000 speed: 268.0 images/sec total_norm: 148.6911 (151.4949) loss: 135.5490 (137.3901) masked_loss: 1.3895 (1.4084) tag_loss: 134.0743 (135.9817) time: 1.4311 (1.9108) data: 0.0001 (0.0005) to_device: 0.0052 (0.0051) time_gpu: 1.4257 (1.9052) save_time: 8.8421 (14.2643) lr: 0.000007 max mem: 26307 2022-03-17 11:09:21,942.942 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5625 2022-03-17 11:09:21,942.942 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 167.65245056152344 2022-03-17 11:09:21,943.943 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.82623415715068 2022-03-17 11:09:52,935.935 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02395820990204811 2022-03-17 11:09:52,936.936 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:09:52,936.936 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'black', 'and', 'white', 'cat', '[MASK]', 'behind', 'a', 'screen', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:09:52,952.952 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['eye', 'nose', 'head', 'cat', 'mouth', 'face', '[UNK]', 'ear', 'dog', 'leg', 'animal', 'wall', 'white', 'man', 'arm', 'hand', 'person', 'black', 'fence', 'screen', 'hair', 'foot', 'light', 'collar', 'paw', 'picture', 'reflection', 'shadow', 'body', 'neck', 'tongue', 'shirt', 'tie', 'dress', 'something', 'mesh', 'stripe', 'bow', 'cage', 'back', 'image', 'dark', 'photo', 'woman', 'front', 'camera', 'tile', 'tail', 'fur', 'floor'] 2022-03-17 11:10:08,863.863 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'black', 'white', 'mouth', 'wall', 'eye', 'screen', 'dog', 'animal', 'leg', 'nose', 'ear', 'cat'] 2022-03-17 11:12:32,492.492 2829:trainer.py:487 do_train_dict(): eta: 2:06:52 iter: 62100 speed: 268.2 images/sec total_norm: 148.8828 (150.4552) loss: 136.4514 (138.0297) masked_loss: 1.3424 (1.3971) tag_loss: 135.1617 (136.6326) time: 1.4307 (1.9091) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4255 (1.9039) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:12:32,852.852 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5151515007019043 2022-03-17 11:12:32,852.852 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 149.12570190429688 2022-03-17 11:12:32,852.852 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
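`Tag mAP` stays around 0.024, three orders of magnitude below the precision number; that magnitude would fit a mean over a very large tag vocabulary in which most classes have few or no positives in the evaluation pool. The averaging convention is not visible here, so the sketch below, which skips classes with no positives (one common choice), is only one possibility:

```python
# Sketch: multi-label mAP via scikit-learn. How the pipeline handles
# classes with no positives is unknown and strongly affects the magnitude.
import numpy as np
from sklearn.metrics import average_precision_score

def tag_map(scores, targets):
    # scores, targets: (num_samples, num_tags); targets in {0, 1}
    aps = [
        average_precision_score(targets[:, c], scores[:, c])
        for c in range(targets.shape[1])
        if targets[:, c].any()            # skip classes with no positives
    ]
    return float(np.mean(aps)) if aps else 0.0
```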
= 71.82594158871764 2022-03-17 11:13:03,785.785 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02396056056022644 2022-03-17 11:13:03,786.786 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:13:03,786.786 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'girls', 'legs', 'wearing', 'a', 'pair', 'of', '[MASK]', 'shoes', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:13:03,801.801 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', 'leg', 'brick', 'ground', 'person', 'sock', 'bench', '[UNK]', 'short', 'shadow', 'foot', 'arm', 'sidewalk', 'woman', 'design', 'man', 'top', 'white', 'wheel', 'jean', 'wall', 'shirt', 'heel', 'red', 'boy', 'bag', 'trash', 'bolt', 'black', 'logo', 'knee', 'front', 'base', 'skirt', 'next', 'head', 'stripe', 'back', 'street', 'hand', 'pair', 'wooden', 'flower', 'jacket', 'handle', 'can', 'dress', 'chair', 'stone', 'label'] 2022-03-17 11:13:19,773.773 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'little', 'red', 'ground', 'person', 'arm', 'girls', 'jean', 'pair', 'leg', 'bag', 'brick', 'bench', 'shoe', 'windshield', 'sock'] 03-17 11:13:31.474 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 11:13:31.475 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 11:13:32.453 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 11:15:43,632.632 2829:trainer.py:487 do_train_dict(): eta: 2:03:57 iter: 62200 speed: 267.9 images/sec total_norm: 147.6249 (149.1771) loss: 136.6075 (135.7381) masked_loss: 1.3502 (1.3613) tag_loss: 135.5584 (134.3768) time: 1.4313 (1.9115) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4263 (1.9064) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:15:43,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 11:15:43,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.79275512695312 2022-03-17 11:15:43,993.993 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8344291760489 2022-03-17 11:16:14,910.910 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023973818868398666 2022-03-17 11:16:14,910.910 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:16:14,911.911 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'horses', 'standing', 'by', 'a', 'trailer', 'in', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:16:14,926.926 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'tree', 'tail', 'leg', 'horse', 'bus', 'ground', 'track', 'window', 'sky', 'tire', 'shadow', 'head', 'roof', '[UNK]', 'car', 'hat', 'person', 'door', 'cart', 'building', 'man', 'vehicle', 'blanket', 'harness', 'train', 'saddle', 'face', 'wood', 'snowy', 'paw', 'sign', 'covered', 'shirt', 'wheel', 'next', 'ear', 'trailer', 'truck', 'pole', 'light', 'mane', 'cloud', 'road', 'hair', 'jean', 'stick', 'leash', 'post', 'bench'] 2022-03-17 11:16:30,767.767 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'number', 'door', 'car', 'ground', 'track', 'window', 'tree', 'wood', 'horse', 'sky', 'shirt', 'bus', 'leg', 'roof', 'snow', 'tail', 'hat', 'pole', 'trailer', 'tire', 'harness'] 2022-03-17 11:18:54,664.664 2829:trainer.py:487 do_train_dict(): eta: 2:01:01 iter: 62300 speed: 268.0 images/sec total_norm: 149.0175 (150.6475) loss: 138.3447 (140.0259) masked_loss: 1.3720 (1.3775) tag_loss: 137.0161 (138.6484) time: 1.4313 (1.9103) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4264 (1.9051) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:18:55,024.024 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6363636255264282 2022-03-17 11:18:55,025.025 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 100.81631469726562 2022-03-17 11:18:55,025.025 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
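Every `Sample Generation` list contains exactly 50 tags, which points to a fixed top-k decode of the tag head rather than a score threshold. A sketch; the sigmoid scoring and the HuggingFace-style convert_ids_to_tokens call are assumptions:

```python
# Sketch: fixed top-k tag decode matching the 50-tag sample lists;
# the tokenizer API assumed here follows the HuggingFace convention.
import torch

def top_tags(tag_logits, tokenizer, k=50):
    scores = torch.sigmoid(tag_logits)    # (batch, num_tags) multi-label scores
    top = torch.topk(scores, k=k, dim=-1)
    return [tokenizer.convert_ids_to_tokens(row.tolist())
            for row in top.indices]
```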
= 71.84540224686647 2022-03-17 11:19:26,342.342 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.023985836654901505 2022-03-17 11:19:26,343.343 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:19:26,343.343 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'is', '[MASK]', 'a', 'snow', '##board', 'wince', 'a', 'hill', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:19:26,359.359 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['snow', 'sky', '[UNK]', 'person', 'man', 'jacket', 'mountain', 'ground', 'ski', 'leg', 'hill', 'pole', 'hat', 'track', 'backpack', 'arm', 'shadow', 'shirt', 'helmet', 'rock', 'coat', 'head', 'skier', 'group', 'tree', 'cloud', 'board', 'glove', 'mound', 'top', 'snowy', 'ramp', 'pile', 'hand', 'slope', 'foot', 'hair', 'sign', 'boot', 'face', 'logo', 'fence', 'building', 'flag', 'line', 'couple', 'bunch', 'air', 'bag', 'camera'] 2022-03-17 11:19:42,346.346 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'ground', 'rock', 'board', 'person', 'arm', 'hill', 'mountain', 'sky', 'leg', 'clothes', 'snow', 'coat', 'pole', 'jacket', 'ski', 'helmet'] 2022-03-17 11:22:05,883.883 2829:trainer.py:487 do_train_dict(): eta: 1:58:06 iter: 62400 speed: 267.8 images/sec total_norm: 146.9800 (150.2008) loss: 135.2062 (137.2163) masked_loss: 1.4445 (1.4491) tag_loss: 133.9010 (135.7672) time: 1.4319 (1.9122) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.9070) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:22:06,248.248 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196 2022-03-17 11:22:06,248.248 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 170.20883178710938 2022-03-17 11:22:06,249.249 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.84179468383789 2022-03-17 11:22:37,294.294 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02401658520102501 2022-03-17 11:22:37,294.294 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:22:37,295.295 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'cannons', 'bunch', 'of', 'people', '[MASK]', 'with', 'a', 'ball', 'on', 'a', 'field', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:22:37,310.310 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'grass', 'shirt', 'man', 'tree', 'short', 'hand', '[UNK]', 'boy', 'building', 'stripe', 'hat', 'person', 'hair', 'lot', 'parking', 'van', 'ground', 'field', 'shoe', 'shadow', 'cap', 'suv', 'park', 'arm', 'window', 'roof', 'house', 'watch', 'sky', 'sock', 'group', 'woman', 'head', 'fence', 'truck', 'leg', 'game', 'vehicle', 'cone', 'sunglasses', 'uniform', 'air', 'girl', 'young', 'foot', 'tire', 'line', 'jersey', 'bush'] 2022-03-17 11:22:53,263.263 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'building', 'park', 'short', 'car', 'field', 'ground', 'hair', 'girl', 'person', 'lot', 'arm', 'boy', 'foot', 'window', 'tree', 'watch', 'ball', 'jean', 'shirt', 'vehicle', 'shadow', 'grass', 'parking', 'hat', 'cap', 'jacket', 'fence', 'bunch', 'shoe', 'suv', 'stripe'] 2022-03-17 11:25:17,117.117 2829:trainer.py:487 do_train_dict(): eta: 1:55:10 iter: 62500 speed: 267.7 images/sec total_norm: 149.1545 (153.3212) loss: 138.8012 (139.8543) masked_loss: 1.4600 (1.4645) tag_loss: 137.1537 (138.3898) time: 1.4312 (1.9124) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4263 (1.9073) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:25:17,478.478 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 11:25:17,479.479 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 153.95199584960938 2022-03-17 11:25:17,479.479 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8436902140657 2022-03-17 11:25:48,950.950 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024019574746489525 2022-03-17 11:25:48,950.950 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:25:48,950.950 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'people', 'wearing', 'hats', 'that', 'double', 'as', '[MASK]', '##s', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:25:48,966.966 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'glasses', 'woman', 'face', 'umbrella', 'hair', 'head', 'wall', '[UNK]', 'table', 'hand', 'design', 'smile', 'person', 'nose', 'eye', 'door', 'man', 'picture', 'chair', 'window', 'cup', 'glass', 'bowl', 'arm', 'button', 'lady', 'boy', 'hat', 'food', 'plate', 'cabinet', 'couple', 'paper', 'bottle', 'girl', 'jacket', 'kitchen', 'box', 'shelf', 'rack', 'light', 'knife', 'handle', 'floor', 'can', 'pot', 'watch', 'book', 'bracelet'] 2022-03-17 11:26:04,919.919 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'face', 'door', 'woman', 'cup', 'hair', 'design', 'person', 'table', 'wall', 'smile', 'glass', 'eye', 'shirt', 'nose', 'bowl', 'cabinet', 'hat', 'pan', 'glasses', 'pot', 'shelf', 'container', 'lid', 'umbrella', 'stove', 'knob', 'microwave'] 2022-03-17 11:28:28,433.433 2829:trainer.py:487 do_train_dict(): eta: 1:52:15 iter: 62600 speed: 267.6 images/sec total_norm: 148.2464 (151.1456) loss: 139.1720 (138.5533) masked_loss: 1.3828 (1.4162) tag_loss: 137.7024 (137.1371) time: 1.4314 (1.9131) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.9080) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:28:28,793.793 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902 2022-03-17 11:28:28,793.793 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.42344665527344 2022-03-17 11:28:28,793.793 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8429318234871 2022-03-17 11:29:00,245.245 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024010566994547844 2022-03-17 11:29:00,246.246 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:29:00,246.246 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '##ammed', 'black', 'bear', 'resting', 'in', 'a', 'large', 'ham', '##mo', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:29:00,261.261 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bear', 'chain', 'nose', 'head', 'snout', 'eye', 'ear', 'paw', 'face', 'tree', 'bag', 'black', 'brown', 'bolt', 'animal', 'muzzle', 'mouth', 'large', 'trunk', 'rope', 'buckle', '[UNK]', 'next', 'bucket', 'building', 'leg', 'ground', 'claw', 'wall', 'foot', 'gear', 'front', 'strap', 'purse', 'pocket', 'wooden', 'horse', 'basket', 'elephant', 'rock', 'leather', 'log', 'grass', 'man', 'teddy', 'dog', 'barrel', 'screw', 'structure', 'bush'] 2022-03-17 11:29:16,231.231 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'face', 'black', 'building', 'large', 'eye', 'baby', 'tree', 'dog', 'nose', 'bag', 'ear', 'bear', 'chain', 'brick', 'gear', 'resting', 'barrel', 'bolt', 'snout', 'paw'] 2022-03-17 11:31:39,446.446 2829:trainer.py:487 do_train_dict(): eta: 1:49:19 iter: 62700 speed: 268.0 images/sec total_norm: 147.6171 (149.7025) loss: 137.2416 (139.2092) masked_loss: 1.4862 (1.4644) tag_loss: 136.0555 (137.7448) time: 1.4295 (1.9102) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4243 (1.9050) save_time: 8.8421 (14.2643) lr: 0.000006 max mem: 26307 2022-03-17 11:31:39,807.807 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5428571701049805 2022-03-17 11:31:39,807.807 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 161.46360778808594 2022-03-17 11:31:39,807.807 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.84104612678479 2022-03-17 11:32:11,191.191 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02400583028793335 2022-03-17 11:32:11,191.191 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:32:11,191.191 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bathroom', 'is', 'shown', 'with', 'a', 'stainless', 'steel', 'shelf', ',', '[MASK]', 'and', 'wall', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:32:11,207.207 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'toilet', 'shelf', 'bathroom', 'paper', 'phone', 'bottle', 'lid', 'telephone', 'holder', 'shower', 'floor', 'cord', 'tile', 'door', 'knob', 'head', 'seat', '[UNK]', 'can', 'mirror', 'cabinet', 'button', 'soap', 'handle', 'roll', 'hose', 'control', 'drain', 'room', 'tank', 'light', 'brush', 'reflection', 'cap', 'small', 'dish', 'white', 'tub', 'towel', 'hair', 'bar', 'outlet', 'hand', 'sink', 'box', 'cup', 'bowl', 'rack', 'vent'] 2022-03-17 11:32:27,150.150 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'top', 'light', 'floor', 'wall', 'phone', 'paper', 'steel', 'cabinet', 'bathroom', 'bottle', 'shower', 'telephone', 'brush', 'holder', 'towel', 'basket', 'shelf', 'cord', 'toilet', 'lid', 'knob', 'stainless'] 2022-03-17 11:34:50,775.775 2829:trainer.py:487 do_train_dict(): eta: 1:46:23 iter: 62800 speed: 267.6 images/sec total_norm: 150.2934 (153.1475) loss: 138.7876 (138.4844) masked_loss: 1.3304 (1.3679) tag_loss: 137.0730 (137.1164) time: 1.4316 (1.9133) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4266 (1.9081) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:34:51,135.135 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 2022-03-17 11:34:51,136.136 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 126.8299560546875 2022-03-17 11:34:51,136.136 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.84651027006261 2022-03-17 11:35:22,638.638 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024001579731702805 2022-03-17 11:35:22,638.638 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:35:22,639.639 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'red', 'motorcycle', '[MASK]', '[MASK]', 'a', 'large', 'warehouse', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:35:22,654.654 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['building', 'motorcycle', 'tire', 'wheel', 'bike', 'window', '[UNK]', 'light', 'road', 'street', 'sky', 'garage', 'door', 'sign', 'engine', 'sidewalk', 'line', 'pole', 'pipe', 'helmet', 'seat', 'tree', 'fender', 'car', 'wall', 'spoke', 'mirror', 'ground', 'man', 'curb', 'red', 'gas', 'front', 'tank', 'shadow', 'cloud', 'flag', 'black', 'jacket', 'next', 'city', 'exhaust', 'person', 'letter', 'can', 'trash', 'grass', 'rim', 'pillar', 'jean'] 2022-03-17 11:35:38,605.605 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'building', 'large', 'door', 'road', 'street', 'red', 'light', 'car', 'seat', 'engine', 'window', 'tree', 'sky', 'tank', 'wheel', 'mirror', 'cloud', 'garage', 'bike', 'pipe', 'motorcycle', 'warehouse', 'tire', 'exhaust', 'fender'] 2022-03-17 11:38:02,168.168 2829:trainer.py:487 do_train_dict(): eta: 1:43:28 iter: 62900 speed: 267.5 images/sec total_norm: 146.7174 (149.8026) loss: 142.2374 (142.9025) masked_loss: 1.4138 (1.4346) tag_loss: 140.8938 (141.4679) time: 1.4311 (1.9139) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4259 (1.9088) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:38:02,529.529 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 11:38:02,529.529 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 128.69232177734375 2022-03-17 11:38:02,530.530 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
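The printed learning rate falls from 0.000008 at iter ~61.2k to 0.000005 here and 0.000004 by ~63.5k, i.e. it keeps decaying as the eta approaches zero. That is consistent with, for example, a linear decay to zero near the end of training; the sketch below uses a base_lr and max_iter chosen only to roughly fit the logged values, and both are assumptions:

```python
# Sketch: a linear-decay schedule that roughly reproduces the logged lr;
# base_lr=1e-4 and max_iter=66500 are fitted assumptions, not known values.
def linear_decay_lr(step, base_lr=1e-4, max_iter=66500):
    return base_lr * max(0.0, 1.0 - step / max_iter)

# linear_decay_lr(61200) ~ 8.0e-6 ; linear_decay_lr(63700) ~ 4.2e-6
```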
= 71.85154658120776 2022-03-17 11:38:34,389.389 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02400333806872368 2022-03-17 11:38:34,390.390 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:38:34,390.390 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', 'elephants', 'are', 'being', '[MASK]', '##ed', 'down', 'a', '[MASK]', 'street', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:38:34,405.405 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'person', 'sign', 'sky', 'building', 'sidewalk', 'cloud', 'elephant', 'man', 'jacket', 'street', 'shirt', '[UNK]', 'trunk', 'pole', 'road', 'coat', 'wall', 'bag', 'city', 'fire', 'car', 'jean', 'woman', 'tail', 'hair', 'branch', 'group', 'ear', 'line', 'bridge', 'pig', 'shoe', 'animal', 'child', 'dirt', 'boy', 'window', 'stand', 'sheep', 'brick', 'curb', 'head', 'truck', 'hat', 'sack', 'can', 'block', 'shadow', 'ground'] 2022-03-17 11:38:50,402.402 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'man', 'line', 'building', 'road', 'street', 'short', 'car', 'fire', 'ground', 'hair', 'person', 'wall', 'tree', 'ball', 'sign', 'sky', 'block', 'shirt', 'truck', 'suit', 'coat', 'cloud', 'pole', 'jacket', 'bunch', 'elephant', 'sidewalk'] 2022-03-17 11:41:13,637.637 2829:trainer.py:487 do_train_dict(): eta: 1:40:32 iter: 63000 speed: 267.4 images/sec total_norm: 148.2000 (151.0592) loss: 137.4753 (140.3135) masked_loss: 1.3761 (1.3950) tag_loss: 135.7175 (138.9185) time: 1.4311 (1.9146) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4261 (1.9095) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:41:14,000.000 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8235294222831726 2022-03-17 11:41:14,000.000 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.79339599609375 2022-03-17 11:41:14,000.000 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8502342538486 2022-03-17 11:41:45,453.453 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024008918553590775 2022-03-17 11:41:45,453.453 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:41:45,454.454 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'in', 'a', 'blue', 'shirt', 'and', 'apron', 'stands', 'near', 'a', 'counter', 'that', 'has', 'food', 'stacked', '[MASK]', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:41:45,469.469 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['man', 'hair', 'oven', 'head', 'wall', 'ear', 'shirt', 'hand', 'food', 'arm', 'grill', 'nose', '[UNK]', 'handle', 'plate', 'table', 'fire', 'pipe', 'pizza', 'face', 'wood', 'kitchen', 'stove', 'container', 'box', 'cord', 'lid', 'jean', 'tool', 'bucket', 'knife', 'pan', 'beard', 'belt', 'light', 'tray', 'stick', 'apron', 'ground', 'hose', 'door', 'dough', 'rack', 'bracelet', 'shelf', 'top', 'floor', 'something', 'bowl', 'fireplace'] 2022-03-17 11:42:01,430.430 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'hair', 'blue', 'table', 'wall', 'food', 'chair', 'bar', 'wood', 'shirt', 'nose', 'ear', 'bowl', 'counter', 'handle', 'knife', 'pipe', 'pizza', 'cord', 'rack', 'oven', 'grill'] 03-17 11:43:32.526 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 11:43:32.526 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 11:43:33.805 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 11:44:25,037.037 2829:trainer.py:487 do_train_dict(): eta: 1:37:36 iter: 63100 speed: 267.5 images/sec total_norm: 148.0542 (150.9405) loss: 140.1829 (139.8030) masked_loss: 1.3559 (1.4068) tag_loss: 138.3090 (138.3962) time: 1.4317 (1.9141) data: 0.0001 (0.0005) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.9086) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:44:25,398.398 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 11:44:25,398.398 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.31765747070312 2022-03-17 11:44:25,399.399 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
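The monitor() snapshots land almost exactly 30 minutes apart (10:43:31, 11:13:32, 11:43:33), so aml_server.py evidently polls the GPUs on a fixed interval. A sketch of such a loop; the interval and the daemon-thread structure are assumptions, and poll_fn could be the gpu_info sketch from earlier:

```python
# Sketch: fixed-interval GPU polling matching the ~30-min spacing of the
# monitor() entries; interval and threading details are assumptions.
import threading
import time

def start_monitor(poll_fn, interval_s=1800):
    def loop():
        while True:
            poll_fn()                  # e.g. log the gpu_info() output
            time.sleep(interval_s)
    t = threading.Thread(target=loop, daemon=True)
    t.start()
    return t
```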
= 71.85510305211514 2022-03-17 11:44:57,218.218 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02400299161672592 2022-03-17 11:44:57,219.219 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:44:57,219.219 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'view', 'looking', 'out', '[MASK]', 'two', 'adjacent', '[MASK]', 'windows', 'of', 'two', 'airplanes', '[MASK]', 'pavement', 'with', 'yellow', 'lines', 'and', 'gray', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:44:57,235.235 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tail', 'sky', 'line', 'airplane', 'wing', 'window', 'cloud', 'runway', 'ground', 'engine', 'door', 'airport', '[UNK]', 'tree', 'nose', 'wheel', 'mirror', 'building', 'plane', 'road', 'logo', 'mountain', 'windshield', 'cockpit', 'vehicle', 'cone', 'grass', 'jet', 'pole', 'large', 'letter', 'stair', 'white', 'front', 'gate', 'propeller', 'person', 'stripe', 'tire', 'fuselage', 'sign', 'commercial', 'man', 'car', 'way', 'tower', 'name', 'truck', 'blue', 'cart'] 2022-03-17 11:45:13,186.186 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'door', 'road', 'ground', 'window', 'wing', 'tree', 'sky', 'yellow', 'nose', 'wheel', 'adjacent', 'tail', 'runway', 'airplane', 'pavement'] 2022-03-17 11:47:36,372.372 2829:trainer.py:487 do_train_dict(): eta: 1:34:40 iter: 63200 speed: 267.6 images/sec total_norm: 147.0079 (149.7055) loss: 133.6707 (134.7307) masked_loss: 1.4062 (1.3912) tag_loss: 132.0955 (133.3394) time: 1.4305 (1.9133) data: 0.0001 (0.0001) to_device: 0.0051 (0.0051) time_gpu: 1.4250 (1.9081) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:47:36,734.734 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 11:47:36,734.734 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.88140869140625 2022-03-17 11:47:36,734.734 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.85977142117035 2022-03-17 11:48:08,315.315 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024018915370106697 2022-03-17 11:48:08,316.316 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:48:08,316.316 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'tennis', 'player', 'stands', 'awaiting', 'the', '[MASK]', 'expect', '##antly', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:48:08,331.331 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shoe', '[UNK]', 'court', 'woman', 'tennis', 'short', 'sock', 'chair', 'shirt', 'wall', 'hair', 'leg', 'hand', 'tank', 'skirt', 'ground', 'top', 'person', 'player', 'man', 'ball', 'head', 'band', 'logo', 'boy', 'letter', 'arm', 'hat', 'watch', 'handle', 'cap', 'ponytail', 'wrist', 'ear', 'girl', 'line', 'fence', 'bracelet', 'dirt', 'shadow', 'sign', 'dress', 'flower', 'sunglasses', 'glasses', 'banner', 'tree', 'uniform', 'jacket', 'outfit'] 2022-03-17 11:48:24,231.231 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'top', 'player', 'woman', 'court', 'short', 'ground', 'hair', 'person', 'wall', 'chair', 'tree', 'ball', 'letter', 'shirt', 'dog', 'leg', 'crown', 'tank', 'tennis', 'hat', 'wrist', 'banner', 'skirt', 'shoe', 'ponytail', 'sock'] 2022-03-17 11:50:48,035.035 2829:trainer.py:487 do_train_dict(): eta: 1:31:44 iter: 63300 speed: 267.1 images/sec total_norm: 148.8337 (151.8663) loss: 137.5023 (139.1452) masked_loss: 1.3838 (1.3903) tag_loss: 136.3300 (137.7550) time: 1.4323 (1.9166) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4271 (1.9114) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:50:48,396.396 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 11:50:48,396.396 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 129.29562377929688 2022-03-17 11:50:48,397.397 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.86152212702514 2022-03-17 11:51:20,261.261 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.0240024384111166 2022-03-17 11:51:20,261.261 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:51:20,262.262 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'big', 'boat', 'is', 'doing', 'down', 'the', '[MASK]', 'carrying', 'passengers', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:51:20,277.277 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'building', 'window', 'tree', 'boat', 'sky', 'fence', 'bridge', 'reflection', 'door', 'person', 'wall', 'light', 'river', 'stair', 'sign', 'roof', 'post', 'dock', '[UNK]', 'red', 'man', 'canal', 'bottom', 'railing', 'lamp', 'front', 'step', 'sidewalk', 'wake', 'ripple', 'city', 'grass', 'bush', 'engine', 'flag', 'cover', 'car', 'large', 'canopy', 'shirt', 'flower', 'writing', 'clock', 'tower', 'umbrella', 'arch', 'white', 'balcony', 'entrance'] 2022-03-17 11:51:36,256.256 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['man', 'water', 'building', 'river', 'door', 'light', 'big', 'hair', 'post', 'person', 'wall', 'bridge', 'cover', 'window', 'tree', 'sign', 'sky', 'shirt', 'dog', 'boat', 'wake', 'fence', 'reflection', 'lamp'] 2022-03-17 11:53:59,851.851 2829:trainer.py:487 do_train_dict(): eta: 1:28:48 iter: 63400 speed: 266.9 images/sec total_norm: 147.7166 (152.0751) loss: 139.4232 (139.4124) masked_loss: 1.4781 (1.4925) tag_loss: 137.9693 (137.9198) time: 1.4330 (1.9181) data: 0.0001 (0.0002) to_device: 0.0049 (0.0048) time_gpu: 1.4281 (1.9131) save_time: 8.8421 (14.2643) lr: 0.000005 max mem: 26307 2022-03-17 11:54:00,211.211 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.800000011920929 2022-03-17 11:54:00,212.212 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 138.64056396484375 2022-03-17 11:54:00,212.212 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.86758499145508 2022-03-17 11:54:32,134.134 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024002449586987495 2022-03-17 11:54:32,134.134 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:54:32,135.135 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'staring', 'at', 'a', 'television', 'screen', 'with', 'geese', 'on', 'it', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:54:32,150.150 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cat', 'television', 'bird', 'logo', 'shelf', 'wall', 'picture', 'mountain', 'screen', 'wing', 'box', 'sky', 'ear', 'grass', 'field', 'head', 'beak', 'table', 'tail', 'stand', '[UNK]', 'frame', 'book', 'duck', 'speaker', 'tv', 'bottle', 'water', 'hill', 'cloud', 'cord', 'room', 'animal', 'flat', 'wire', 'vase', 'bat', 'baseball', 'ground', 'dog', 'penguin', 'candle', 'base', 'floor', 'fireplace', 'man', 'player', 'paper', 'painting', 'airplane'] 2022-03-17 11:54:48,146.146 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'television', 'wall', 'hill', 'mountain', 'wing', 'box', 'sky', 'picture', 'screen', 'dog', 'ear', 'staring', 'bird', 'frame', 'cat', 'grass', 'tail', 'bottle', 'speaker', 'wire', 'logo', 'duck', 'shelf', 'beak', 'geese'] 2022-03-17 11:57:11,572.572 2829:trainer.py:487 do_train_dict(): eta: 1:25:52 iter: 63500 speed: 267.1 images/sec total_norm: 147.7207 (149.5208) loss: 137.8922 (138.6856) masked_loss: 1.2975 (1.3355) tag_loss: 136.5096 (137.3501) time: 1.4328 (1.9172) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4277 (1.9121) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 11:57:11,932.932 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7142857313156128 2022-03-17 11:57:11,932.932 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 120.93940734863281 2022-03-17 11:57:11,932.932 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.87128718993948 2022-03-17 11:57:43,686.686 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024005835875868797 2022-03-17 11:57:43,686.686 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 11:57:43,687.687 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'sheep', 'dog', 'rounding', 'up', 'sheep', 'as', 'on', '[MASK]', '##oke', '##rs', 'watch', 'jul', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 11:57:43,702.702 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'tree', 'sheep', 'shirt', 'man', 'hat', 'ground', 'woman', 'road', 'fence', 'building', 'head', 'gravel', '[UNK]', 'leg', 'sunglasses', 'boy', 'group', 'grass', 'bench', 'trunk', 'girl', 'dog', 'hand', 'roof', 'house', 'shoe', 'jacket', 'wall', 'jean', 'bush', 'hair', 'cap', 'child', 'wool', 'skirt', 'window', 'glasses', 'leaf', 'flower', 'shadow', 'table', 'animal', 'front', 'pen', 'car', 'vest', 'coat', 'sign', 'herd'] 2022-03-17 11:57:59,678.678 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'building', 'road', 'woman', 'ground', 'person', 'child', 'boy', 'tree', 'jean', 'shirt', 'dog', 'roof', 'tail', 'hat', 'cap', 'sheep', 'fence', 'sunglasses'] 2022-03-17 12:00:23,254.254 2829:trainer.py:487 do_train_dict(): eta: 1:22:56 iter: 63600 speed: 267.1 images/sec total_norm: 149.1510 (153.5733) loss: 138.9556 (140.8472) masked_loss: 1.3446 (1.4042) tag_loss: 137.6695 (139.4431) time: 1.4320 (1.9168) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4271 (1.9117) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:00:23,616.616 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6388888955116272 2022-03-17 12:00:23,616.616 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 139.521728515625 2022-03-17 12:00:23,616.616 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8750135879876 2022-03-17 12:00:55,399.399 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02401728555560112 2022-03-17 12:00:55,399.399 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:00:55,400.400 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'group', 'of', 'vehicles', 'that', '[MASK]', 'sitting', 'in', '[MASK]', 'street', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:00:55,415.415 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'car', 'cloud', 'light', 'pole', 'road', 'sign', 'street', 'bridge', 'line', 'traffic', 'tire', '[UNK]', 'wall', 'building', 'highway', 'grass', 'truck', 'fence', 'sidewalk', 'tree', 'window', 'parking', 'license', 'bus', 'van', 'person', 'arrow', 'curb', 'plate', 'windshield', 'lot', 'suv', 'wheel', 'intersection', 'barrier', 'tower', 'tail', 'vehicle', 'man', 'city', 'cone', 'bush', 'mirror', 'flag', 'freeway', 'busy', 'ground', 'large', 'water'] 2022-03-17 12:01:11,327.327 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'group', 'line', 'building', 'road', 'street', 'light', 'car', 'design', 'bridge', 'window', 'sign', 'sky', 'clock', 'mirror', 'cloud', 'pole', 'barrel', 'fence', 'barrier', 'sidewalk', 'tire'] 2022-03-17 12:03:35,077.077 2829:trainer.py:487 do_train_dict(): eta: 1:20:00 iter: 63700 speed: 266.9 images/sec total_norm: 148.3544 (151.4542) loss: 138.0045 (139.8757) masked_loss: 1.3450 (1.3894) tag_loss: 136.7826 (138.4863) time: 1.4313 (1.9183) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4262 (1.9132) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:03:35,438.438 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 12:03:35,438.438 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 144.75341796875 2022-03-17 12:03:35,439.439 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.87280789438086 2022-03-17 12:04:07,445.445 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024043269455432892 2022-03-17 12:04:07,445.445 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:04:07,446.446 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'dog', 'is', '[MASK]', 'to', 'get', 'a', '[MASK]', '##is', '##bee', 'out', 'of', 'someone', "'", 's', 'hand', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:04:07,461.461 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['head', 'dog', 'floor', 'nose', '[UNK]', 'eye', 'ear', 'cat', 'plate', 'carpet', 'collar', 'shoe', 'face', 'black', 'snout', 'paw', 'leg', 'person', 'design', 'wire', 'cd', 'disc', 'cord', 'front', 'neck', 'next', 'body', 'shadow', 'wall', 'couch', 'pillow', 'table', 'foot', 'wheel', 'white', 'rug', 'button', 'top', 'back', 'light', 'ground', 'spot', 'brown', 'mouse', 'mouth', 'hand', 'chair', 'circle', 'computer', 'remote'] 2022-03-17 12:04:23,435.435 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'someone', 'person', 'floor', 'eye', 'paper', 'foot', 'dog', 'nose', 'ear', 'cat', 'plate', 'carpet', 'shoe', 'paw'] 2022-03-17 12:06:47,128.128 2829:trainer.py:487 do_train_dict(): eta: 1:17:04 iter: 63800 speed: 266.6 images/sec total_norm: 150.0342 (152.7523) loss: 137.9230 (138.1504) masked_loss: 1.3779 (1.3900) tag_loss: 136.0630 (136.7605) time: 1.4327 (1.9205) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4274 (1.9153) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:06:47,489.489 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6285714507102966 2022-03-17 12:06:47,490.490 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 108.09188842773438 2022-03-17 12:06:47,490.490 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88343372255424 2022-03-17 12:07:19,653.653 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024055734276771545 2022-03-17 12:07:19,654.654 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:07:19,654.654 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'paint', 'holding', 'food', 'eclectic', 'includes', '[MASK]', '##cco', '##li', 'and', 'meat', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:07:19,670.670 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'food', 'plate', '[UNK]', 'meat', 'glass', 'potato', 'knife', 'handle', 'liquid', 'beer', 'steak', 'cup', 'vegetable', 'chicken', 'blade', 'reflection', 'design', 'carrot', 'sauce', 'fork', 'drink', 'shadow', 'fish', 'beverage', 'meal', 'cheese', 'white', 'screw', 'mushroom', 'pepper', 'onion', 'bread', 'bowl', 'next', 'bottle', 'dinner', 'green', 'piece', 'tea', 'light', 'salad', 'leaf', 'corn', 'juice', 'paper', 'breast', 'pizza', 'different', 'blue'] 2022-03-17 12:07:35,537.537 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'white', 'table', 'food', 'glass', 'handle', 'plate', 'shadow', 'beer', 'knife', 'meat', 'liquid', 'paint', 'cake', 'potato', 'steak'] 2022-03-17 12:09:59,004.004 2829:trainer.py:487 do_train_dict(): eta: 1:14:07 iter: 63900 speed: 266.8 images/sec total_norm: 147.0879 (149.2175) loss: 137.6447 (138.8817) masked_loss: 1.4027 (1.4419) tag_loss: 136.2104 (137.4398) time: 1.4327 (1.9188) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4275 (1.9136) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:09:59,366.366 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6944444179534912 2022-03-17 12:09:59,366.366 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 152.9551544189453 2022-03-17 12:09:59,366.366 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88531295657158 2022-03-17 12:10:31,182.182 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024061333388090134 2022-03-17 12:10:31,183.183 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:10:31,184.184 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'red', 'bus', 'driving', 'down', 'an', 'english', 'street', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:10:31,199.199 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sky', 'clock', 'tower', 'bus', 'road', 'building', 'line', 'window', 'car', 'street', 'tree', '[UNK]', 'cloud', 'wheel', 'tire', 'person', 'plate', 'license', 'castle', 'arrow', 'spire', 'sidewalk', 'sign', 'back', 'light', 'roof', 'top', 'decker', 'windshield', 'door', 'front', 'shadow', 'city', 'pole', 'fence', 'flag', 'cone', 'red', 'double', 'cross', 'man', 'truck', 'tall', 'curb', 'large', 'background', 'passenger', 'traffic', 'side', 'busy'] 2022-03-17 12:10:47,200.200 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'building', 'large', 'top', 'door', 'road', 'front', 'street', 'red', 'car', 'castle', 'window', 'tree', 'tower', 'sign', 'sky', 'bus', 'clock', 'plate', 'wheel', 'license', 'cloud', 'arrow', 'tire', 'cone', 'windshield', 'spire'] 2022-03-17 12:13:11,217.217 2829:trainer.py:487 do_train_dict(): eta: 1:11:11 iter: 64000 speed: 266.4 images/sec total_norm: 150.0655 (153.2719) loss: 137.4899 (138.0345) masked_loss: 1.4258 (1.4411) tag_loss: 135.7237 (136.5934) time: 1.4342 (1.9221) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4289 (1.9169) save_time: 8.8421 (14.2643) lr: 0.000004 max mem: 26307 2022-03-17 12:13:11,578.578 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 12:13:11,578.578 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 165.48831176757812 2022-03-17 12:13:11,579.579 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88359491761872 03-17 12:13:33.905 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 12:13:33.905 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 12:13:34.585 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 0}] 2022-03-17 12:13:43,286.286 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024044038727879524 2022-03-17 12:13:43,286.286 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:13:43,287.287 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'man', 'sitting', 'at', 'a', '[MASK]', '[MASK]', 'looking', 'at', 'a', 'computer', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:13:43,302.302 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['curtain', 'shirt', 'table', 'man', 'laptop', 'floor', 'leg', 'wall', 'hair', 'keyboard', 'screen', 'short', 'computer', '[UNK]', 'room', 'window', 'foot', 'head', 'face', 'hand', 'chair', 'beard', 'coffee', 'ear', 'door', 'television', 'glasses', 'nose', 'cup', 'cord', 'desk', 'shoe', 'picture', 'camera', 'handle', 'mouse', 'monitor', 'sock', 'phone', 'arm', 'mug', 'bed', 'rug', 'mouth', 'top', 'pillow', 'book', 'stand', 'lamp', 'front'] 2022-03-17 12:13:59,275.275 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'room', 'door', 'cup', 'short', 'television', 'hair', 'floor', 'table', 'wall', 'seat', 'chair', 'computer', 'window', 'shirt', 'kitchen', 'leg', 'roof', 'ear', 'camera', 'coat', 'deck', 'jacket', 'sink', 'monitor', 'keyboard', 'curtain', 'cord', 'laptop'] 2022-03-17 12:16:23,229.229 2829:trainer.py:487 do_train_dict(): eta: 1:08:15 iter: 64100 speed: 266.7 images/sec total_norm: 149.5569 (151.4975) loss: 139.1717 (141.6541) masked_loss: 1.3183 (1.3352) tag_loss: 137.5977 (140.3190) time: 1.4330 (1.9202) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4279 (1.9150) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:16:23,590.590 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7058823704719543 2022-03-17 12:16:23,590.590 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.14840698242188 2022-03-17 12:16:23,591.591 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8877135362952 2022-03-17 12:16:55,478.478 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024037275463342667 2022-03-17 12:16:55,479.479 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:16:55,479.479 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'the', 'birds', '[MASK]', 'feeding', '[MASK]', 'the', 'bird', 'feeder', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:16:55,494.494 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bird', 'feeder', 'branch', 'tail', 'tree', 'seed', 'head', 'cage', 'wire', 'leaf', 'hole', 'chain', 'food', 'hook', 'wing', '[UNK]', 'feather', 'beak', 'pole', 'eye', 'basket', 'metal', 'window', 'water', 'container', 'wall', 'foot', 'dish', 'leg', 'handle', 'trunk', 'small', 'top', 'tray', 'object', 'hanging', 'mesh', 'ground', 'vine', 'light', 'plant', 'next', 'string', 'base', 'building', 'cord', 'box', 'spot', 'bean', 'holder'] 2022-03-17 12:17:11,441.441 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'food', 'tree', 'branch', 'sky', 'chain', 'bird', 'hole', 'tail', 'seed', 'pole', 'leaf', 'wire', 'cage', 'hook', 'feather', 'feeder'] 2022-03-17 12:19:35,440.440 2829:trainer.py:487 do_train_dict(): eta: 1:05:18 iter: 64200 speed: 266.4 images/sec total_norm: 147.1614 (150.7025) loss: 139.2600 (139.8896) masked_loss: 1.3733 (1.4037) tag_loss: 138.0028 (138.4859) time: 1.4330 (1.9221) data: 0.0001 (0.0005) to_device: 0.0050 (0.0049) time_gpu: 1.4278 (1.9166) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:19:35,801.801 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.59375 2022-03-17 12:19:35,802.802 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 131.42889404296875 2022-03-17 12:19:35,802.802 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.88970443583202 2022-03-17 12:20:08,078.078 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02407951094210148 2022-03-17 12:20:08,079.079 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:20:08,079.079 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'bunch', 'of', 'cows', 'that', 'are', '[MASK]', 'the', 'grass', '.', '1979', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:20:08,094.094 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['cow', 'building', 'tree', 'grass', 'window', 'roof', 'head', 'sky', 'ear', 'chimney', 'pole', 'road', 'fence', 'post', 'house', 'field', 'nose', 'face', 'cloud', 'wall', 'barn', '[UNK]', 'sign', 'herd', 'sheep', 'line', 'cattle', 'pasture', 'green', 'animal', 'calf', 'leg', 'door', 'truck', 'wire', 'wheel', 'ground', 'car', 'hill', 'tag', 'rope', 'large', 'eye', 'wood', 'collar', 'background', 'white', 'farm', 'forest', 'spot'] 2022-03-17 12:20:24,001.001 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'face', 'building', 'door', 'road', 'field', 'post', 'wall', 'window', 'tree', 'sky', 'roof', 'nose', 'ear', 'grass', 'cloud', 'pole', 'barn', 'bunch', 'cow', 'chimney'] 2022-03-17 12:22:47,563.563 2829:trainer.py:487 do_train_dict(): eta: 1:02:22 iter: 64300 speed: 266.5 images/sec total_norm: 149.3504 (153.0083) loss: 139.2202 (137.7410) masked_loss: 1.3827 (1.4195) tag_loss: 137.9611 (136.3215) time: 1.4334 (1.9212) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4282 (1.9161) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:22:47,924.924 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 12:22:47,924.924 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 140.81643676757812 2022-03-17 12:22:47,924.924 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89210189795642 2022-03-17 12:23:20,020.020 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02411685697734356 2022-03-17 12:23:20,021.021 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:23:20,021.021 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'woman', '[MASK]', 'down', 'while', 'holding', 'a', 'black', 'umbrella', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:23:20,037.037 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'umbrella', 'hand', 'head', '[UNK]', 'shoe', 'woman', 'ground', 'watch', 'foot', 'bag', 'face', 'handle', 'ear', 'person', 'building', 'floor', 'scarf', 'nose', 'leg', 'jacket', 'shirt', 'arm', 'door', 'girl', 'backpack', 'pole', 'sidewalk', 'mouth', 'flop', 'glasses', 'hair', 'eye', 'flip', 'man', 'lady', 'cloth', 'rock', 'clothing', 'child', 'hood', 'strap', 'pipe', 'ledge', 'dress', 'pink', 'graffiti', 'stripe', 'brick', 'hole'] 2022-03-17 12:23:35,964.964 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'hand', 'face', 'black', 'building', 'door', 'woman', 'ground', 'floor', 'wall', 'lady', 'eye', 'foot', 'watch', 'bag', 'ear', 'chain', 'handle', 'glasses', 'towel', 'shoe', 'umbrella', 'graffiti', 'scarf'] 2022-03-17 12:25:59,877.877 2829:trainer.py:487 do_train_dict(): eta: 0:59:25 iter: 64400 speed: 266.2 images/sec total_norm: 148.6476 (150.0444) loss: 139.4954 (138.3436) masked_loss: 1.4479 (1.4636) tag_loss: 138.1433 (136.8799) time: 1.4333 (1.9232) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4281 (1.9181) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:26:00,237.237 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.65625 2022-03-17 12:26:00,237.237 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 111.07911682128906 2022-03-17 12:26:00,237.237 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.90109275995299 2022-03-17 12:26:32,329.329 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024121245369315147 2022-03-17 12:26:32,329.329 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:26:32,329.329 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '##typic', 'of', 'a', 'pub', '[MASK]', 'named', 'the', 'lion', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:26:32,345.345 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'building', 'sky', 'light', 'pole', 'street', 'window', 'tree', 'store', 'roof', '[UNK]', 'car', 'sidewalk', 'wall', 'city', 'person', 'road', 'stop', 'traffic', 'door', 'letter', 'man', 'line', 'trash', 'can', 'fire', 'lamp', 'jacket', 'restaurant', 'arrow', 'curb', 'post', 'cloud', 'mirror', 'flag', 'corner', 'shirt', 'chimney', 'banner', 'bag', 'pipe', 'plant', 'coat', 'reflection', 'shop', 'fence', 'intersection', 'woman', 'night', 'balcony'] 2022-03-17 12:26:48,255.255 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'building', 'road', 'street', 'light', 'car', 'person', 'wall', 'view', 'paper', 'window', 'store', 'letter', 'sign', 'sky', 'coat', 'pole', 'pub', 'lamp', 'sidewalk'] 2022-03-17 12:29:12,080.080 2829:trainer.py:487 do_train_dict(): eta: 0:56:29 iter: 64500 speed: 266.4 images/sec total_norm: 147.1693 (154.0176) loss: 138.6938 (139.7664) masked_loss: 1.4062 (1.4373) tag_loss: 136.6238 (138.3291) time: 1.4319 (1.9220) data: 0.0001 (0.0001) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9168) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:29:12,440.440 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7777777910232544 2022-03-17 12:29:12,441.441 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 151.95896911621094 2022-03-17 12:29:12,441.441 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89935798172611 2022-03-17 12:29:44,655.655 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02411646395921707 2022-03-17 12:29:44,655.655 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:29:44,656.656 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[unused516]', 'red', 'and', 'white', 'boat', 'parked', 'next', '[MASK]', 'a', 'house', 'with', 'a', 'woman', 'standing', 'next', 'to', 'a', 'dog', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:29:44,671.671 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['boat', 'bush', 'water', 'tree', 'reflection', 'man', 'wall', 'roof', 'person', 'shirt', 'short', 'building', 'dog', 'flag', 'woman', 'hedge', 'sky', 'grass', '[UNK]', 'hair', 'writing', 'house', 'motor', 'jacket', 'hat', 'head', 'small', 'plant', 'engine', 'bottom', 'red', 'pole', 'dock', 'post', 'rope', 'flower', 'chair', 'hand', 'window', 'door', 'lamp', 'front', 'name', 'river', 'wheel', 'boy', 'skirt', 'leg', 'car', 'white'] 2022-03-17 12:30:00,631.631 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'house', 'name', 'next', 'water', 'building', 'white', 'red', 'woman', 'short', 'hair', 'person', 'wall', 'engine', 'tree', 'sky', 'shirt', 'dog', 'boat', 'roof', 'flag', 'grass', 'bush', 'hat', 'ski', 'reflection', 'hedge'] 2022-03-17 12:32:24,435.435 2829:trainer.py:487 do_train_dict(): eta: 0:53:32 iter: 64600 speed: 266.2 images/sec total_norm: 148.8909 (152.3475) loss: 140.1122 (139.7863) masked_loss: 1.3984 (1.3970) tag_loss: 138.6306 (138.3893) time: 1.4319 (1.9236) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9184) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:32:24,796.796 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6666666865348816 2022-03-17 12:32:24,797.797 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 157.34005737304688 2022-03-17 12:32:24,797.797 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89721969179982 2022-03-17 12:32:57,019.019 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024100879207253456 2022-03-17 12:32:57,020.020 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:32:57,020.020 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'some', 'cartoon', 'character', '[MASK]', 'are', '[MASK]', 'in', 'this', 'photo', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:32:57,035.035 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'table', 'brick', 'mouth', 'pen', 'box', 'face', 'eye', 'toy', 'button', '[UNK]', 'book', 'handle', 'carrot', 'nose', 'phone', 'pencil', 'stand', 'label', 'marker', 'bottle', 'base', 'pumpkin', 'shadow', 'display', 'screen', 'block', 'holder', 'orange', 'drawer', 'top', 'cord', 'cap', 'next', 'wooden', 'bag', 'container', 'teeth', 'floor', 'antenna', 'tag', 'cloth', 'strap', 'cell', 'other', 'room', 'item', 'stack', 'cover', 'key'] 2022-03-17 12:33:12,874.874 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'mouth', 'table', 'wall', 'character', 'cover', 'stand', 'eye', 'box', 'label', 'nose', 'display', 'handle', 'brick', 'apple', 'photo', 'button', 'pen', 'item', 'toy', 'cartoon', 'marker', 'strap'] 2022-03-17 12:35:36,940.940 2829:trainer.py:487 do_train_dict(): eta: 0:50:36 iter: 64700 speed: 266.0 images/sec total_norm: 147.8625 (151.1586) loss: 137.2090 (139.5478) masked_loss: 1.3045 (1.3882) tag_loss: 135.9091 (138.1596) time: 1.4320 (1.9250) data: 0.0001 (0.0002) to_device: 0.0051 (0.0051) time_gpu: 1.4269 (1.9198) save_time: 8.8421 (14.2643) lr: 0.000003 max mem: 26307 2022-03-17 12:35:37,300.300 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091 2022-03-17 12:35:37,300.300 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 124.28950500488281 2022-03-17 12:35:37,301.301 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.90295072838113 2022-03-17 12:36:09,741.741 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024104544892907143 2022-03-17 12:36:09,742.742 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:36:09,742.742 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'woman', 'holding', 'up', 'a', 'cell', 'phone', 'in', 'front', '[MASK]', 'her', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:36:09,757.757 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hair', 'head', 'shirt', 'bang', 'phone', 'hand', 'jacket', 'girl', 'man', 'woman', 'person', 'sleeve', 'eye', 'cell', 'cuff', 'wall', 'picture', '[UNK]', 'button', 'ear', 'face', 'nose', 'door', 'coat', 'camera', 'finger', 'screen', 'arm', 'green', 'young', 'light', 'blonde', 'sign', 'photo', 'pole', 'ceiling', 'chair', 'glasses', 'cup', 'ponytail', 'jean', 'cabinet', 'front', 'background', 'sweater', 'window', 'lady', 'blond', 'glass', 'building'] 2022-03-17 12:36:25,642.642 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['head', 'man', 'hand', 'face', 'front', 'woman', 'hair', 'girl', 'person', 'wall', 'phone', 'eye', 'cell', 'shirt', 'screen', 'finger', 'ear', 'coat', 'jacket', 'bang', 'sleeve', 'cuff'] 2022-03-17 12:38:49,177.177 2829:trainer.py:487 do_train_dict(): eta: 0:47:39 iter: 64800 speed: 266.3 images/sec total_norm: 149.3931 (151.4496) loss: 134.2734 (136.5747) masked_loss: 1.3836 (1.4233) tag_loss: 132.1940 (135.1515) time: 1.4319 (1.9224) data: 0.0001 (0.0002) to_device: 0.0050 (0.0049) time_gpu: 1.4266 (1.9173) save_time: 8.8421 (14.2643) lr: 0.000002 max mem: 26307 2022-03-17 12:38:49,538.538 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.47058823704719543 2022-03-17 12:38:49,538.538 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 158.97872924804688 2022-03-17 12:38:49,539.539 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8952772319776 2022-03-17 12:39:22,172.172 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02409748174250126 2022-03-17 12:39:22,173.173 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:39:22,173.173 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'two', 'cats', 'sitting', 'on', 'a', 'lounge', 'chair', '[MASK]', 'looking', 'out', 'a', 'window', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:39:22,189.189 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['window', 'cat', 'ear', 'floor', 'car', 'head', 'leg', 'wall', 'paw', 'bench', 'couch', 'cushion', 'building', 'tail', '[UNK]', 'light', 'reflection', 'chair', 'shadow', 'black', 'arm', 'seat', 'sofa', 'room', 'pillow', 'frame', 'towel', 'bolt', 'small', 'nose', 'yellow', 'next', 'lamp', 'white', 'tire', 'carpet', 'table', 'blanket', 'wheel', 'dog', 'book', 'foot', 'snow', 'suitcase', 'bed', 'large', 'road', 'front', 'sun', 'top'] 2022-03-17 12:39:38,131.131 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'building', 'book', 'car', 'floor', 'wall', 'seat', 'arm', 'chair', 'window', 'ear', 'cat', 'tail', 'couch', 'bench', 'shelf', 'lounge', 'paw', 'cushion'] 2022-03-17 12:42:01,805.805 2829:trainer.py:487 do_train_dict(): eta: 0:44:42 iter: 64900 speed: 265.8 images/sec total_norm: 147.9228 (150.3981) loss: 137.4824 (137.6302) masked_loss: 1.3522 (1.3890) tag_loss: 136.4287 (136.2412) time: 1.4310 (1.9263) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4259 (1.9213) save_time: 8.8421 (14.2643) lr: 0.000002 max mem: 26307 2022-03-17 12:42:02,165.165 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5 2022-03-17 12:42:02,165.165 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 154.29269409179688 2022-03-17 12:42:02,165.165 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89981867276705 2022-03-17 12:42:34,790.790 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024111177772283554 2022-03-17 12:42:34,790.790 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:42:34,791.791 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'large', '[MASK]', 'in', 'short', 'brown', 'hair', 'don', '##s', 'a', '[MASK]', '[MASK]', 'girl', 'outfit', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:42:34,806.806 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['hand', 'skirt', 'shirt', 'wall', 'leg', 'floor', 'box', 'woman', 'belt', 'television', 'hair', 'face', 'bag', '[UNK]', 'mouth', 'man', 'arm', 'monitor', 'head', 'nose', 'ground', 'computer', 'eye', 'outlet', 'tile', 'desk', 'handle', 'picture', 'person', 'finger', 'shoe', 'dress', 'shelf', 'ceiling', 'cord', 'girl', 'book', 'room', 'frame', 'screen', 'drawer', 'cabinet', 'paper', 'table', 'stripe', 'sock', 'glasses', 'cardboard', 'light', 'stand'] 2022-03-17 12:42:50,813.813 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'school', 'head', 'man', 'hand', 'large', 'book', 'woman', 'short', 'television', 'ground', 'hair', 'girl', 'person', 'floor', 'wall', 'brown', 'smile', 'computer', 'box', 'border', 'shirt', 'picture', 'screen', 'leg', 'bag', 'desk', 'frame', 'tie', 'belt', 'stick', 'monitor', 'collar', 'skirt', 'pillow', 'outfit', 'shelf', 'drawer', 'tile'] 03-17 12:43:34.669 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi 03-17 12:43:34.669 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi 03-17 12:43:35.884 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 99}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}] 2022-03-17 12:45:14,199.199 2829:trainer.py:487 do_train_dict(): eta: 0:41:46 iter: 65000 speed: 266.1 images/sec total_norm: 148.2992 (151.1839) loss: 138.9783 (139.3487) masked_loss: 1.4523 (1.4882) tag_loss: 137.3715 (137.8605) time: 1.4306 (1.9238) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4255 (1.9186) save_time: 8.8421 (14.2643) lr: 0.000002 max mem: 26307 2022-03-17 12:45:14,201.201 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0065000.pt 2022-03-17 12:45:23,646.646 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6764705777168274 
2022-03-17 12:45:23,647.647 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 148.27684020996094 2022-03-17 12:45:23,647.647 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. = 71.89644677298409 2022-03-17 12:45:56,045.045 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024094535037875175 2022-03-17 12:45:56,045.045 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:45:56,046.046 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'television', 'is', 'sitting', '[MASK]', '[MASK]', 'stand', 'with', 'toys', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:45:56,061.061 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'handle', 'drawer', 'television', 'toy', 'cabinet', 'curtain', 'dresser', 'screen', 'shirt', 'floor', 'hat', 'blanket', 'bed', 'house', '[UNK]', 'pillow', 'bag', 'table', 'picture', 'box', 'frame', 'window', 'track', 'baby', 'clothes', 'room', 'couch', 'train', 'top', 'doll', 'horse', 'chair', 'desk', 'head', 'small', 'door', 'block', 'sheet', 'shoe', 'leg', 'cap', 'hair', 'cloth', 'stand', 'basket', 'child', 'man', 'person', 'decoration'] 2022-03-17 12:46:11,952.952 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'television', 'bed', 'wall', 'stand', 'block', 'shirt', 'picture', 'screen', 'clothes', 'bag', 'desk', 'frame', 'handle', 'cabinet', 'bow', 'sheet', 'shade', 'blanket', 'toy', 'pillow', 'lamp', 'curtain', 'drawer', 'dresser'] 2022-03-17 12:48:34,626.626 2829:trainer.py:487 do_train_dict(): eta: 0:38:49 iter: 65100 speed: 255.5 images/sec total_norm: 149.4908 (150.6414) loss: 138.9427 (139.9967) masked_loss: 1.4009 (1.4396) tag_loss: 137.5636 (138.5571) time: 1.4302 (2.0044) data: 0.0001 (0.0002) to_device: 0.0050 (0.0048) time_gpu: 1.4251 (1.9086) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:48:34,988.988 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365 2022-03-17 12:48:34,988.988 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.5562744140625 2022-03-17 12:48:34,988.988 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.89891885979775 2022-03-17 12:49:07,678.678 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02409212850034237 2022-03-17 12:49:07,678.678 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:49:07,679.679 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'villiers', '[MASK]', 'on', 'a', 'snowy', 'surface', 'wit', 'ha', 'kite', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:49:07,694.694 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shadow', 'snow', 'ground', 'kite', '[UNK]', 'hand', 'coat', 'jacket', 'track', 'arm', 'head', 'hair', 'string', 'leg', 'face', 'boot', 'hat', 'girl', 'glove', 'person', 'woman', 'shoe', 'stick', 'shirt', 'ski', 'pole', 'hood', 'scarf', 'rope', 'branch', 'boy', 'tree', 'foot', 'man', 'sky', 'flower', 'backpack', 'young', 'mouth', 'leaf', 'blue', 'eye', 'sunglasses', 'glasses', 'short', 'rock', 'tail', 'wire', 'sun', 'design'] 2022-03-17 12:49:23,588.588 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'hand', 'face', 'ground', 'hair', 'girl', 'person', 'child', 'arm', 'surface', 'leg', 'snow', 'string', 'shadow', 'ha', 'coat', 'bottle', 'hat', 'flower', 'jacket', 'leaf', 'hood', 'rope', 'boot', 'shoe', 'glove', 'wit', 'kite', 'snowy'] 2022-03-17 12:51:47,051.051 2829:trainer.py:487 do_train_dict(): eta: 0:35:52 iter: 65200 speed: 266.1 images/sec total_norm: 148.1385 (152.8203) loss: 137.1125 (138.2405) masked_loss: 1.3537 (1.3640) tag_loss: 135.5274 (136.8765) time: 1.4317 (1.9242) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4264 (1.9190) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:51:47,411.411 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6000000238418579 2022-03-17 12:51:47,412.412 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 143.08262634277344 2022-03-17 12:51:47,412.412 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.8989073737292 2022-03-17 12:52:20,005.005 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024089697748422623 2022-03-17 12:52:20,005.005 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:52:20,006.006 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'usb', 'hub', 'with', 'multiple', 'electronics', 'plug', '[MASK]', 'in', '98', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:52:20,021.021 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'laptop', 'keyboard', 'desk', 'ball', 'computer', 'key', 'phone', 'screen', 'cord', 'wall', 'ipod', '[UNK]', 'mouse', 'logo', 'cell', 'wheel', 'wire', 'floor', 'pen', 'printer', 'pad', 'monitor', 'paper', 'equipment', 'button', 'electronic', 'electronics', 'camera', 'case', 'device', 'knob', 'black', 'speaker', 'base', 'wallet', 'cd', 'object', 'wooden', 'other', 'antenna', 'cable', 'box', 'plug', 'circle', 'next', 'small', 'book', 'cup', 'bottle'] 2022-03-17 12:52:35,963.963 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'case', 'table', 'wall', 'phone', 'key', 'paper', 'ball', 'multiple', 'cd', 'desk', 'wheel', 'mouse', 'logo', 'keyboard', 'hub', 'cord', 'pad', 'laptop', 'printer', 'ipod'] 2022-03-17 12:54:59,700.700 2829:trainer.py:487 do_train_dict(): eta: 0:32:55 iter: 65300 speed: 265.8 images/sec total_norm: 147.7688 (149.2258) loss: 137.2031 (136.0783) masked_loss: 1.3435 (1.3949) tag_loss: 135.5815 (134.6835) time: 1.4319 (1.9265) data: 0.0001 (0.0005) to_device: 0.0049 (0.0048) time_gpu: 1.4270 (1.9212) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:55:00,060.060 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5882353186607361 2022-03-17 12:55:00,060.060 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 155.49758911132812 2022-03-17 12:55:00,061.061 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.9048068020322 2022-03-17 12:55:32,540.540 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02407936006784439 2022-03-17 12:55:32,540.540 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:55:32,541.541 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', '##raf', '##fe', '[MASK]', 'standing', 'in', 'a', 'grassy', 'field', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:55:32,556.556 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['bush', 'leg', 'tree', 'sky', 'neck', 'grass', '[UNK]', 'head', 'field', 'shadow', 'tail', 'ground', 'ear', 'cloud', 'hair', 'background', 'spot', 'horn', 'mane', 'dirt', 'standing', 'grassy', 'body', 'tall', 'open', 'area', 'next', 'plain', 'animal', 'face', 'wild', 'green', 'shrub', 'mouth', 'grazing', 'branch', 'distance', 'stand', 'front', 'dry', 'large', 'lone', 'middle', 'walking', 'day', 'adult', 'small', 'man', 'habitat', 'savannah'] 2022-03-17 12:55:48,465.465 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'field', 'ground', 'hair', 'neck', 'tree', 'sky', 'spot', 'leg', 'ear', 'shadow', 'grass', 'tail', 'bush', 'dirt', 'grassy'] 2022-03-17 12:58:12,289.289 2829:trainer.py:487 do_train_dict(): eta: 0:29:58 iter: 65400 speed: 265.9 images/sec total_norm: 148.6661 (150.4916) loss: 138.8993 (139.9551) masked_loss: 1.3772 (1.4394) tag_loss: 137.7617 (138.5157) time: 1.4331 (1.9259) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4279 (1.9208) save_time: 8.8805 (13.8650) lr: 0.000002 max mem: 26307 2022-03-17 12:58:12,649.649 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7714285850524902 2022-03-17 12:58:12,649.649 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 132.91424560546875 2022-03-17 12:58:12,649.649 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.90525168498964 2022-03-17 12:58:45,560.560 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024068038910627365 2022-03-17 12:58:45,561.561 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 12:58:45,561.561 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'truck', 'with', 'wood', 'side', 'rails', 'in', 'the', 'back', ',', '[MASK]', 'in', 'a', 'parking', 'space', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 12:58:45,577.577 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['tree', 'truck', 'sky', 'light', 'tire', 'line', 'bumper', 'window', 'road', 'pole', 'building', 'back', 'plate', 'car', 'ground', 'license', 'fence', 'door', 'tail', 'sign', 'wheel', 'bed', 'pickup', 'street', 'handle', 'lot', 'wood', 'wire', 'mirror', 'parking', 'logo', 'wall', 'curb', 'rim', 'cloud', 'grass', 'pick', '[UNK]', 'shadow', 'old', 'power', 'roof', 'post', 'side', 'white', 'house', 'chain', 'next', 'front', 'bush'] 2022-03-17 12:59:01,563.563 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['back', 'side', 'line', 'building', 'road', 'street', 'light', 'car', 'ground', 'space', 'person', 'bed', 'wall', 'window', 'tree', 'wood', 'sky', 'truck', 'plate', 'shadow', 'wheel', 'mirror', 'brick', 'grass', 'parking', 'tail', 'license', 'pole', 'fence', 'rim', 'tire', 'curb', 'railing', 'pedal', 'bumper'] 2022-03-17 13:01:24,945.945 2829:trainer.py:487 do_train_dict(): eta: 0:27:01 iter: 65500 speed: 265.8 images/sec total_norm: 148.7471 (153.3763) loss: 137.3178 (139.4328) masked_loss: 1.4122 (1.3989) tag_loss: 135.7238 (138.0338) time: 1.4319 (1.9266) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.9215) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307 2022-03-17 13:01:25,305.305 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6470588445663452 2022-03-17 13:01:25,306.306 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 98.13138580322266 2022-03-17 13:01:25,306.306 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.91387111966203 2022-03-17 13:01:58,029.029 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024125345051288605 2022-03-17 13:01:58,029.029 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 13:01:58,030.030 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'there', 'are', 'airplanes', 'waiting', 'to', '[MASK]', 'off', '[MASK]', 'the', 'runway', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 13:01:58,045.045 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['line', 'airplane', 'tail', 'wing', 'sky', 'window', 'wheel', 'door', 'runway', 'airport', 'ground', '[UNK]', 'cloud', 'engine', 'road', 'plane', 'tree', 'logo', 'nose', 'building', 'mountain', 'large', 'stripe', 'gear', 'fuselage', 'landing', 'pole', 'tire', 'view', 'letter', 'grass', 'commercial', 'front', 'white', 'blue', 'vehicle', 'mirror', 'cone', 'jet', 'red', 'propeller', 'frame', 'way', 'fin', 'stair', 'name', 'windshield', 'small', 'side', 'truck'] 2022-03-17 13:02:13,965.965 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'line', 'door', 'road', 'ground', 'window', 'wing', 'tree', 'sky', 'nose', 'wheel', 'tail', 'runway', 'airplane'] 2022-03-17 13:04:37,724.724 2829:trainer.py:487 do_train_dict(): eta: 0:24:04 iter: 65600 speed: 265.6 images/sec total_norm: 148.8321 (152.0124) loss: 136.5320 (137.9185) masked_loss: 1.3544 (1.3954) tag_loss: 135.2360 (136.5231) time: 1.4316 (1.9277) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4263 (1.9227) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307 2022-03-17 13:04:38,089.089 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6176470518112183 2022-03-17 13:04:38,089.089 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 145.99453735351562 2022-03-17 13:04:38,089.089 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.91470381622082 2022-03-17 13:05:11,211.211 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024118809029459953 2022-03-17 13:05:11,211.211 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577 2022-03-17 13:05:11,211.211 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'boats', 'in', 'a', 'river', 'with', 'tall', 'buildings', 'on', '[MASK]', '.', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]'] 2022-03-17 13:05:11,227.227 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['water', 'building', 'sky', 'boat', 'tree', 'beach', 'city', 'shore', 'window', 'bridge', 'cloud', 'mountain', 'background', '[UNK]', 'roof', 'pole', 'structure', 'body', 'sand', 'skyscraper', 'mast', 'ocean', 'large', 'bottom', 'umbrella', 'canopy', 'bush', 'bird', 'rock', 'dock', 'lake', 'shoreline', 'tower', 'antenna', 'flag', 'blue', 'palm', 'metal', 'hill', 'harbor', 'reflection', 'ship', 'front', 'distance', 'top', 'river', 'ball', 'sail', 'sea', 'bay'] 2022-03-17 13:05:27,157.157 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['city', 'water', 'building', 'river', 'land', 'structure', 'window', 'metal', 'tree', 'beach', 'sky', 'bottom', 'boat', 'tall', 'shore', 'dock'] 2022-03-17 13:07:50,632.632 2829:trainer.py:487 do_train_dict(): eta: 0:21:07 iter: 65700 speed: 265.4 images/sec total_norm: 149.1738 (150.6159) loss: 138.4534 (139.1638) masked_loss: 1.4229 (1.4210) tag_loss: 136.9041 (137.7428) time: 1.4317 (1.9291) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4263 (1.9240) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307 2022-03-17 13:07:50,993.993 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5757575631141663 2022-03-17 13:07:50,993.993 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 136.94309997558594 2022-03-17 13:07:50,993.993 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision. 
= 71.92130625646527
2022-03-17 13:08:23,885.885 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024174001067876816
2022-03-17 13:08:23,885.885 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:08:23,886.886 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'many', 'people', 'stand', '[MASK]', 'a', 'tennis', '[MASK]', 'with', 'rack', '##ets', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:08:23,901.901 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'man', 'short', 'shadow', 'shoe', 'tennis', '[UNK]', 'fence', 'court', 'ball', 'line', 'head', 'hand', 'hat', 'tree', 'boy', 'cap', 'ground', 'bat', 'person', 'hair', 'building', 'pole', 'group', 'sunglasses', 'bush', 'sock', 'roof', 'player', 'glasses', 'grass', 'leg', 'woman', 'arm', 'young', 'baseball', 'net', 'house', 'sign', 'handle', 'sky', 'couple', 'logo', 'light', 'face', 'jersey', 'game', 'uniform', 'watch', 'playing']
2022-03-17 13:08:39,891.891 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'head', 'man', 'hand', 'line', 'court', 'short', 'hair', 'boy', 'tree', 'ball', 'shirt', 'tennis', 'shadow', 'hat', 'bat', 'glasses', 'fence', 'shoe', 'sunglasses', 'sock']
2022-03-17 13:11:03,795.795 2829:trainer.py:487 do_train_dict(): eta: 0:18:10 iter: 65800 speed: 265.1 images/sec total_norm: 147.9253 (151.0860) loss: 136.0638 (136.9490) masked_loss: 1.4154 (1.4274) tag_loss: 134.6400 (135.5215) time: 1.4325 (1.9316) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4275 (1.9264) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307
2022-03-17 13:11:04,156.156 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 13:11:04,156.156 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 109.13992309570312
2022-03-17 13:11:04,156.156 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.93311189990123
2022-03-17 13:11:37,237.237 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024181626737117767
2022-03-17 13:11:37,238.238 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:11:37,238.238 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'close', 'up', 'of', 'a', '[MASK]', 'on', 'a', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:11:37,253.253 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['table', 'bottle', 'label', 'bread', 'sandwich', '[UNK]', 'wine', 'glass', 'jar', 'lid', 'food', 'cup', 'plate', 'container', 'wall', 'top', 'vegetable', 'stem', 'meat', 'book', 'pepper', 'menu', 'cheese', 'flower', 'leaf', 'tomato', 'candy', 'next', 'cherry', 'fork', 'napkin', 'can', 'cloth', 'picture', 'background', 'cap', 'spoon', 'paper', 'cookie', 'fruit', 'onion', 'lamp', 'red', 'basket', 'bowl', 'drink', 'salad', 'pole', 'chicken', 'knife']
2022-03-17 13:11:53,152.152 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'cup', 'close', 'table', 'wall', 'food', 'glass', 'branch', 'label', 'wine', 'plate', 'bottle', 'pole', 'bread', 'fork', 'sandwich', 'container', 'lid', 'menu', 'vegetable', 'jar', 'cookie']
03-17 13:13:35.985 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:49 cmd_run(): start to cmd run: nvidia-smi
03-17 13:13:35.985 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:51 cmd_run(): nvidia-smi
03-17 13:13:37.320 42c07f8197104c3b988e50758ff54da200000C 2827 aml_server.py:150 monitor(): [{'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29000, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}, {'mem_used': 29024, 'mem_total': 32510, 'gpu_util': 100}]
2022-03-17 13:14:17,015.015 2829:trainer.py:487 do_train_dict(): eta: 0:15:13 iter: 65900 speed: 265.0 images/sec total_norm: 148.1385 (150.4678) loss: 133.8431 (136.3024) masked_loss: 1.2963 (1.3459) tag_loss: 132.1955 (134.9565) time: 1.4322 (1.9323) data: 0.0001 (0.0001) to_device: 0.0052 (0.0051) time_gpu: 1.4269 (1.9270) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307
2022-03-17 13:14:17,376.376 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6060606241226196
2022-03-17 13:14:17,376.376 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 122.76383972167969
2022-03-17 13:14:17,376.376 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.941799065561
2022-03-17 13:14:50,123.123 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024181395769119263
2022-03-17 13:14:50,123.123 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:14:50,123.123 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'train', 'stops', 'on', 'the', 'tracks', 'outside', 'of', 'the', 'city', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:14:50,139.139 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['track', 'train', 'window', 'building', 'tree', 'ground', 'light', 'pole', 'sky', 'bridge', 'roof', 'car', 'line', '[UNK]', 'front', 'door', 'station', 'platform', 'rail', 'sign', 'fence', 'engine', 'bush', 'wall', 'railing', 'stripe', 'grass', 'person', 'number', 'traffic', 'background', 'windshield', 'top', 'tracks', 'bumper', 'beam', 'passenger', 'sidewalk', 'letter', 'railroad', 'wire', 'street', 'road', 'view', 'signal', 'man', 'wheel', 'black', 'white', 'logo']
2022-03-17 13:15:06,088.088 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'city', 'line', 'station', 'building', 'front', 'light', 'ground', 'track', 'person', 'bridge', 'window', 'train', 'tree', 'sky', 'rail', 'roof', 'bush', 'pole', 'stops', 'fence', 'chimney']
2022-03-17 13:17:30,143.143 2829:trainer.py:487 do_train_dict(): eta: 0:12:16 iter: 66000 speed: 265.1 images/sec total_norm: 149.6373 (151.7165) loss: 135.4386 (135.5210) masked_loss: 1.3788 (1.3778) tag_loss: 133.7582 (134.1432) time: 1.4315 (1.9312) data: 0.0001 (0.0002) to_device: 0.0052 (0.0051) time_gpu: 1.4262 (1.9260) save_time: 8.8805 (13.8650) lr: 0.000001 max mem: 26307
2022-03-17 13:17:30,505.505 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.7352941036224365
2022-03-17 13:17:30,505.505 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 137.94412231445312
2022-03-17 13:17:30,506.506 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.94123101054089
2022-03-17 13:18:03,528.528 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.02417757734656334
2022-03-17 13:18:03,528.528 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:18:03,529.529 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', '[MASK]', 'in', 'a', 'classroom', 'eating', 'a', 'don', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:18:03,544.544 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['shirt', 'table', 'person', 'boy', 'hair', 'chair', 'woman', '[UNK]', 'eye', 'hand', 'wall', 'nose', 'man', 'light', 'ceiling', 'head', 'shoe', 'jean', 'window', 'stool', 'logo', 'arm', 'restaurant', 'sign', 'girl', 'ear', 'letter', 'food', 'plate', 'background', 'face', 'paper', 'poster', 'door', 'floor', 'writing', 'board', 'pizza', 'fan', 'leg', 'hat', 'shelf', 'box', 'cap', 'young', 'bag', 'bottle', 'cup', 'picture', 'child']
2022-03-17 13:18:19,511.511 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'man', 'hand', 'book', 'door', 'light', 'woman', 'board', 'hair', 'girl', 'person', 'table', 'wall', 'boy', 'eye', 'chair', 'window', 'sign', 'jean', 'shirt', 'background', 'nose', 'restaurant', 'kid', 'logo', 'classroom', 'shoe']
2022-03-17 13:20:43,128.128 2829:trainer.py:487 do_train_dict(): eta: 0:09:18 iter: 66100 speed: 265.3 images/sec total_norm: 147.4262 (149.1995) loss: 139.3563 (138.9289) masked_loss: 1.3532 (1.3958) tag_loss: 137.9441 (137.5331) time: 1.4318 (1.9299) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4265 (1.9247) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:20:43,489.489 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.8529411554336548
2022-03-17 13:20:43,489.489 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 113.00749969482422
2022-03-17 13:20:43,489.489 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.95179955404691
2022-03-17 13:21:16,746.746 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024219419807195663
2022-03-17 13:21:16,746.746 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:21:16,747.747 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'white', 'toilet', 'sitting', 'next', 'to', 'a', '[MASK]', 'tub', 'and', 'a', 'sink', '[MASK]', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:21:16,762.762 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['wall', 'bathroom', '[UNK]', 'tub', 'sink', 'toilet', 'mirror', 'floor', 'shower', 'shelf', 'pipe', 'lid', 'hose', 'tank', 'handle', 'ceiling', 'seat', 'head', 'light', 'bath', 'drain', 'soap', 'cabinet', 'tile', 'white', 'dish', 'knob', 'reflection', 'door', 'small', 'outlet', 'window', 'cord', 'bowl', 'vent', 'bottle', 'brush', 'plug', 'rack', 'towel', 'paper', 'rod', 'curtain', 'cup', 'base', 'hole', 'basket', 'bar', 'holder', 'vanity']
2022-03-17 13:21:32,692.692 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'head', 'next', 'white', 'light', 'floor', 'wall', 'seat', 'tank', 'handle', 'mirror', 'bathroom', 'bottle', 'ceiling', 'shower', 'bath', 'sink', 'brush', 'pipe', 'reflection', 'shelf', 'toilet', 'lid', 'tub', 'vent', 'hose']
2022-03-17 13:23:56,109.109 2829:trainer.py:487 do_train_dict(): eta: 0:06:21 iter: 66200 speed: 265.3 images/sec total_norm: 148.5136 (149.5745) loss: 139.0148 (138.5556) masked_loss: 1.3990 (1.3905) tag_loss: 137.5259 (137.1651) time: 1.4320 (1.9298) data: 0.0001 (0.0002) to_device: 0.0051 (0.0049) time_gpu: 1.4269 (1.9247) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:23:56,470.470 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.5588235259056091
2022-03-17 13:23:56,470.470 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 130.36961364746094
2022-03-17 13:23:56,470.470 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.95701368398076
2022-03-17 13:24:29,726.726 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024217350408434868
2022-03-17 13:24:29,726.726 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:24:29,727.727 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', '[MASK]', 'hydra', '##nts', 'on', 'a', 'side', 'walk', 'near', 'a', 'car', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:24:29,742.742 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['car', 'road', 'street', 'fire', '[UNK]', 'sidewalk', 'pole', 'line', 'plate', 'tree', 'license', 'curb', 'light', 'ground', 'sign', 'chain', 'mirror', 'leaf', 'base', 'building', 'person', 'top', 'tire', 'dirt', 'window', 'motorcycle', 'windshield', 'cap', 'man', 'cover', 'city', 'sky', 'trash', 'back', 'suv', 'bumper', 'paint', 'vehicle', 'side', 'lid', 'bolt', 'van', 'tail', 'traffic', 'jacket', 'rock', 'grass', 'truck', 'head', 'bike']
2022-03-17 13:24:45,651.651 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'many', 'side', 'line', 'building', 'road', 'street', 'car', 'fire', 'ground', 'base', 'walk', 'tree', 'plate', 'license', 'pole', 'leaf', 'lid', 'sidewalk', 'tire', 'puddle']
2022-03-17 13:27:09,520.520 2829:trainer.py:487 do_train_dict(): eta: 0:03:24 iter: 66300 speed: 264.7 images/sec total_norm: 148.9686 (151.6847) loss: 135.4109 (138.0281) masked_loss: 1.3913 (1.3951) tag_loss: 133.8510 (136.6329) time: 1.4329 (1.9341) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4278 (1.9290) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:27:09,881.881 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.6969696879386902
2022-03-17 13:27:09,881.881 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 134.40628051757812
2022-03-17 13:27:09,881.881 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.96192062906472
2022-03-17 13:27:42,822.822 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024204757064580917
2022-03-17 13:27:42,823.823 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:27:42,823.823 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', 'a', 'traffic', 'sign', 'hung', 'upside', 'down', 'on', '[MASK]', 'pole', 'fuji', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:27:42,839.839 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['sign', 'pole', 'arrow', 'grass', 'tree', 'road', 'letter', 'sky', 'street', 'line', 'post', 'pillar', 'car', 'leaf', 'ground', 'building', 'bridge', 'water', 'trunk', 'bush', 'sidewalk', 'hill', '[UNK]', 'shadow', 'fence', 'branch', 'stop', 'window', 'traffic', 'wire', 'background', 'curb', 'roof', 'wall', 'reflection', 'light', 'white', 'column', 'man', 'graffiti', 'way', 'next', 'person', 'highway', 'intersection', 'number', 'front', 'side', 'tire', 'park']
2022-03-17 13:27:58,824.824 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['line', 'building', 'road', 'power', 'car', 'post', 'tree', 'tower', 'letter', 'sign', 'sky', 'traffic', 'hung', 'grass', 'pole', 'leaf', 'wire', 'arrow', 'reflection']
2022-03-17 13:30:22,704.704 2829:trainer.py:487 do_train_dict(): eta: 0:00:26 iter: 66400 speed: 265.0 images/sec total_norm: 147.0204 (150.1494) loss: 137.1686 (139.1832) masked_loss: 1.4167 (1.4033) tag_loss: 135.7120 (137.7799) time: 1.4320 (1.9318) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4267 (1.9266) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:30:23,064.064 2829:tagger_caption_uni_pipeline_expanding.py:404 forward(): caption acc = 0.625
2022-03-17 13:30:23,064.064 2829:tagger_caption_uni_pipeline_expanding.py:408 forward(): Tag Loss = 141.86703491210938
2022-03-17 13:30:23,064.064 2829:tagger_caption_uni_pipeline_expanding.py:409 forward(): Tag Precision.
= 71.9596698445485
2022-03-17 13:30:56,296.296 2829:tagger_caption_uni_pipeline_expanding.py:413 forward(): Tag mAP: 0.024200977757573128
2022-03-17 13:30:56,297.297 2829:tagger_caption_uni_pipeline_expanding.py:418 forward(): # of tokens = 577
2022-03-17 13:30:56,297.297 2829:tagger_caption_uni_pipeline_expanding.py:421 forward(): Input ids sample: ['[CLS]', '[MASK]', 'man', 'standing', 'with', 'a', 'bag', 'of', 'luggage', '.', '[MASK]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
2022-03-17 13:30:56,312.312 2829:tagger_caption_uni_pipeline_expanding.py:427 forward(): Sample Generation: ['person', 'shadow', 'tree', 'head', 'man', 'shirt', 'line', 'wall', 'building', 'ground', '[UNK]', 'woman', 'eye', 'leg', 'photo', 'hair', 'road', 'jacket', 'wire', 'hat', 'pole', 'mirror', 'white', 'girl', 'hand', 'face', 'nose', 'wheel', 'sky', 'umbrella', 'boat', 'car', 'flag', 'sidewalk', 'background', 'kite', 'rope', 'bush', 'ramp', 'front', 'foot', 'floor', 'reflection', 'arm', 'shoe', 'dress', 'picture', 'ear', 'black', 'air']
2022-03-17 13:31:12,248.248 2829:tagger_caption_uni_pipeline_expanding.py:429 forward(): GT Tags: ['[UNK]', 'man', 'hand', 'building', 'ground', 'hair', 'floor', 'boy', 'foot', 'tree', 'sky', 'shirt', 'leg', 'background', 'bag', 'handle', 'shadow', 'bush', 'photo', 'pole', 'fence', 'sail', 'sidewalk', 'suitcase', 'luggage', 'hedge', 'skyscraper']
2022-03-17 13:31:33,985.985 2829:trainer.py:487 do_train_dict(): eta: 0:00:00 iter: 66415 speed: 718.3 images/sec total_norm: 146.6914 (150.1386) loss: 137.2871 (138.2889) masked_loss: 1.4118 (1.4059) tag_loss: 135.7408 (136.8830) time: 1.4321 (1.9305) data: 0.0001 (0.0002) to_device: 0.0051 (0.0050) time_gpu: 1.4270 (1.9253) save_time: 8.8805 (13.8650) lr: 0.000000 max mem: 26307
2022-03-17 13:31:35,075.075 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_final.pt
2022-03-17 13:32:00,524.524 2829:checkpoint.py:222 save(): Saving checkpoint to output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0066415.pt
2022-03-17 13:32:08,980.980 2829:trainer.py:525 do_train_dict(): Total training time: 1 day, 8:47:40.699242 (1.7776 s / it)
2022-03-17 13:32:09,051.051 2829:qd_common.py:625 cmd_run(): start to cmd run: zip -uyrv output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/source_code * -x \*src/CCSCaffe/\* -x \*src/build/lib.linux-x86_64-2.7/\* -x \*build/lib.linux-x86_64-2.7/\* -x \*build/temp.linux-x86_64-2.7/\* -x \*build/lib.linux-x86_64-3.5/\* -x \*build/temp.linux-x86_64-3.5/\* -x \*build/lib.linux-x86_64-3.7/\* -x assets\* -x \*build/temp.linux-x86_64-3.7/\* -x \*build/lib.linux-x86_64-3.6/\* -x
\*build/temp.linux-x86_64-3.6/\* -x \*src/detectron2/datasets/\* -x \*src/CCSCaffe/models/\* -x \*src/CCSCaffe/data/\* -x \*src/CCSCaffe/examples/\* -x \*src/detectron2/output\* -x aux_data/yolo9k/\* -x visualization\* -x output\* -x data\* -x \*.build_release\* -x \*.build_debug\* -x \*.build\* -x \*tmp_run\* -x \*src/CCSCaffe/MSVC/\* -x \*.pyc -x \*.so -x \*.o -x \*src/CCSCaffe/docs/tutorial/\* -x \*src/CCSCaffe/matlab/\* -x \*.git\* -x \*src/qd/mask/modeling/captioning/coco_caption\* -x \*src/qd/mask/modeling/captioning/cider/data\* zip warning: output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/source_code.zip not found or empty adding: CLIPS.ipynb (in=24240) (out=5496) (deflated 77%) adding: README.md (in=20) (out=20) (stored 0%) adding: T5_test.ipynb (in=16824) (out=4410) (deflated 74%) adding: Untitled.ipynb (in=3434027) (out=2161291) (deflated 37%) adding: Visualization.ipynb (in=590003) (out=417444) (deflated 29%) adding: aml_job_config.json (in=4534) (out=1814) (deflated 60%) adding: aux_data/ (in=0) (out=0) (stored 0%) adding: aux_data/configs/ (in=0) (out=0) (stored 0%) adding: aux_data/configs/vigblob_account.yaml (in=596) (out=408) (deflated 32%) adding: aux_data/configs/azure_blob_account.yaml (in=163) (out=148) (deflated 9%) adding: aux_data/configs/vigstandardblob_account.yaml (in=300) (out=247) (deflated 18%) adding: aux_data/configs/others/ (in=0) (out=0) (stored 0%) adding: aux_data/configs/others/vigcancentralblob_account.yaml (in=300) (out=247) (deflated 18%) adding: aux_data/configs/others/philly_vc.yaml (in=2564) (out=823) (deflated 68%) adding: aux_data/configs/others/vigblob_account.yaml (in=595) (out=395) (deflated 34%) adding: aux_data/configs/others/reditimgblob_account.yaml (in=199) (out=170) (deflated 15%) adding: aux_data/configs/others/pengchuan.yaml (in=346) (out=281) (deflated 19%) adding: aux_data/configs/others/vigstandardblob_account.yaml (in=300) (out=247) (deflated 18%) adding: aux_data/configs/others/jfgcommentblob_account.yaml (in=217) (out=174) (deflated 20%) adding: aux_data/configs/others/build_composite_dataset.yaml (in=193) (out=134) (deflated 31%) adding: aux_data/configs/others/bingproductblob_account.yaml (in=308) (out=258) (deflated 16%) adding: aux_data/configs/others/expid_generate.yaml (in=30687) (out=4745) (deflated 85%) adding: aux_data/configs/others/cognitive_credential.yaml (in=112) (out=101) (deflated 10%) adding: aux_data/configs/others/vigeastblob_account.yaml (in=293) (out=245) (deflated 16%) adding: aux_data/configs/others/multi_philly_vc.yaml (in=31) (out=28) (deflated 10%) adding: aux_data/configs/others/vigjpeastblob_account.yaml (in=298) (out=248) (deflated 17%) adding: aux_data/configs/others/vigaueastblob_account.yaml (in=301) (out=250) (deflated 17%) adding: aux_data/configs/others/eu2blob_account.yaml.backup (in=263) (out=235) (deflated 11%) adding: aux_data/configs/others/TaxHardV1.yaml (in=1556) (out=592) (deflated 62%) adding: aux_data/configs/others/azure_blob_account.yaml.backup (in=133) (out=128) (deflated 4%) adding: aux_data/configs/others/extra_tracking_philly_jobs.yaml (in=3) (out=3) (stored 0%) adding: aux_data/configs/others/vigwestblob_account.yaml (in=605) (out=418) (deflated 31%) adding: aux_data/configs/others/vigsouthcenterblob_account.yaml (in=603) (out=419) (deflated 31%) adding: aux_data/configs/others/xiyinwestmaskblob_account.yaml (in=311) (out=255) (deflated 18%) adding: 
aux_data/configs/others/cogsimagestorageblob_account.yaml (in=163) (out=148) (deflated 9%) adding: aux_data/configs/others/mongodb_credential.yaml (in=52) (out=42) (deflated 19%) adding: aux_data/configs/vigeastblob_account.yaml (in=293) (out=245) (deflated 16%) adding: aux_data/configs/xiyinwestmaskblob_account.yaml (in=311) (out=255) (deflated 18%) adding: aux_data/Jacob_config/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_weight_1.yaml (in=2070) (out=866) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8_weight_1.yaml (in=2078) (out=871) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill.yaml (in=2059) (out=865) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8-nonewtokenizer.yaml (in=2122) (out=884) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_not_all_token.yaml (in=2086) (out=872) (deflated 58%) adding: aux_data/Jacob_config/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8.yaml (in=2092) (out=876) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/log.txt (in=54093) (out=3928) (deflated 93%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=653) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918567_3a7ba0fc_008.yaml (in=1465) (out=646) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=654) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/log_OobjectDec.txt (in=0) (out=0) (stored 0%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623798151_a08273e7_008.yaml (in=1491) (out=662) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623117072_824f5561_008.yaml (in=1468) (out=653) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623798165_bf4a05ac_008.yaml (in=1481) (out=654) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622955881_4a8af6c7.yaml (in=1456) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850281_ca891676_008.yaml (in=1463) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a.yaml (in=1460) (out=651) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623784968_3160e6bd_008.yaml (in=1495) (out=664) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623126708_54017d63.yaml (in=1456) (out=643) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a.yaml (in=1460) (out=645) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=648) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=653) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850281_ca891676.yaml (in=1454) (out=641) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=647) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624054505_c137d6e9.yaml (in=1576) (out=680) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624053950_aae348f1.yaml (in=1520) (out=659) (deflated 57%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918567_3a7ba0fc.yaml (in=1456) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=660) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=642) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a_008.yaml (in=1469) (out=649) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116972_4496aa14.yaml (in=1461) (out=646) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622955881_4a8af6c7_008.yaml (in=1465) (out=646) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116931_2268c9d0,_008.yaml (in=1464) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_80_epochs_0.08.yaml (in=1629) (out=694) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_Zhiyuan-PyTorch-Test.yaml (in=1250) (out=544) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_NoVLP.yaml (in=1391) (out=607) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624090469_bf9ca5e4.yaml (in=1646) (out=705) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624054095_f627b37c.yaml (in=1550) (out=668) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-6_iter_10_with_VLP_80_epochs_0.08.yaml (in=1629) (out=694) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624092038_a6c5b171_0.8.yaml (in=1692) (out=715) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_80_epochs_0.08.yaml (in=1629) (out=689) (deflated 58%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623905847_85c4023.yaml (in=1609) (out=692) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624054505_c137d6e9.yaml (in=1606) (out=692) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_80_epochs_0.9.yaml (in=1618) (out=690) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624091901_159eac17_0.9.yaml (in=1692) (out=714) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_80_epochs_0.08.yaml (in=1629) (out=693) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624092038_a6c5b171.yaml (in=1692) (out=712) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624090515_017cf63e.yaml (in=1644) (out=704) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/priori/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1624091901_159eac17.yaml (in=1692) (out=713) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=647) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=642) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116972_4496aa14_008.yaml (in=1468) (out=647) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=660) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=655) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623798711_f6a7aa89_008.yaml (in=1483) (out=655) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623126708_54017d63_008.yaml (in=1465) (out=645) (deflated 56%) adding: 
aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=660) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623117072_824f5561.yaml (in=1461) (out=651) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=653) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622918554_7eb3c00a_008.yaml (in=1469) (out=654) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1476) (out=655) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623116931_2268c9d0.yaml (in=1457) (out=644) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=648) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623117072_824f5561.yaml (in=1461) (out=645) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850950_2792e5e4.yaml (in=1448) (out=636) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=642) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1623784986_6e927e3b_008.yaml (in=1485) (out=658) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1456) (out=648) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1466) (out=648) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622850950_2792e5e4_0.08.yaml (in=1457) (out=639) (deflated 56%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_60_without_VLP_multiscale_112_64.yaml (in=1362) (out=569) 
(deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_2e-4_iter_60_without_VLP_multiscale_112_64.yaml (in=1360) (out=571) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1358) (out=572) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96_token_sample_378.yaml (in=1437) (out=589) (deflated 59%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1358) (out=569) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1414) (out=607) (deflated 57%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_2e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1337) (out=567) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_5e-5_iter_60_without_VLP_multiscale_192_96.yaml (in=1337) (out=567) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96_small_scale_0.9.yaml (in=1334) (out=564) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_5e-5_iter_60_without_VLP_multiscale_112_64.yaml (in=1339) (out=569) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_3e-4_iter_60_without_VLP_multiscale_112_64.yaml (in=1337) (out=567) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_2e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1362) (out=572) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1433) (out=606) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/multi_scale/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_3e-4_iter_60_without_VLP_multiscale_192_96.yaml (in=1433) (out=608) (deflated 58%) adding: aux_data/Jacob_config/coco_captioning/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2545.yaml (in=977) (out=496) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_100distill_iou_i2it2iatt.yaml (in=1114) (out=538) (deflated 52%) adding: 
aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_B_CapS_BS512_MaxIter0e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base63882_3aaa9.yaml (in=922) (out=449) (deflated 51%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2462.yaml (in=1052) (out=528) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2443.yaml (in=1058) (out=532) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2482.yaml (in=1052) (out=530) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_938.yaml (in=1000) (out=506) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_without_VLP.yaml (in=1191) (out=532) (deflated 55%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm_i2i_t2i.yaml (in=1125) (out=542) (deflated 52%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f.yaml (in=976) (out=496) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_965.yaml (in=989) (out=501) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill.yaml (in=1060) (out=516) (deflated 51%) adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_exp2.yaml (in=1360) (out=626) (deflated 54%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_Jianfeng_Best_MiniVLM_LR5e-5.yaml (in=885) (out=478) (deflated 46%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_812.yaml (in=1045) (out=540) (deflated 48%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR1e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_972.yaml (in=1062) (out=548) (deflated 48%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_10distill_iou_i2it2iatt.yaml (in=1112) (out=532) (deflated 52%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_976.yaml (in=1001) (out=507) (deflated 49%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2464.yaml (in=1052) (out=528) (deflated 50%) adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_minilm.yaml (in=1059) (out=519) (deflated 51%) adding: 
aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_845.yaml (in=1051) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp1.yaml (in=859) (out=442) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1041) (out=514) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp11.yaml (in=1166) (out=549) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed.yaml (in=1139) (out=542) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2578.yaml (in=1056) (out=543) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_230.yaml (in=1060) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_5737.yaml (in=991) (out=501) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR2e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_945.yaml (in=1064) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2432.yaml (in=1052) (out=532) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_sparseatt_chamfer_queue.yaml (in=1065) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp761.yaml (in=1064) (out=533) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_5737.yaml (in=991) (out=501) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp24.yaml (in=1062) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp750.yaml (in=1036) (out=529) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2484.yaml (in=1052) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=555) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_213.yaml (in=976) (out=489) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2477.yaml (in=1052) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_scale_0.08_jianfeng.yaml (in=1386) (out=636) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert.yaml (in=1049) (out=516) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_scale_0.9_jianfeng.yaml (in=1411) (out=650) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_B_CapS_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base6.yaml (in=935) (out=489) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=490) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-4_iter_30_without_VLP.yaml (in=1191) (out=531) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001.yaml (in=827) (out=429) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_visualatt_learnable.yaml (in=1127) (out=537) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_visrel.yaml (in=1118) (out=530) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp12.yaml (in=1172) (out=552) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_CC_VLPs_JFTEE_BS512_MaxIter40e_LR5e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN.yaml (in=967) (out=484) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=489) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001.yaml (in=827) (out=429) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t.yaml (in=1064) (out=521) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_Jianfeng_Best_MiniVLM_LR5e-6.yaml (in=889) (out=480) (deflated 46%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_vlp1267.yaml (in=1398) (out=649) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_10distill_iou_i2it2iatt.yaml (in=1112) (out=537) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=518) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill.yaml (in=1066) (out=519) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_jianfeng.yaml (in=1418) (out=652) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2490.yaml (in=1052) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_793.yaml (in=1025) (out=529) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR1e-05_WD0.05_Fpeter_Lpeter.yaml (in=976) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_1038.yaml (in=1035) (out=529) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_VLPs_TaxGoogleCC64split_MiniVLM_LR5e-6.yaml (in=875) (out=460) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_60_without_VLP_scale_0.08.yaml (in=1184) (out=520) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_815.yaml (in=1052) (out=545) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp18.yaml (in=1139) (out=539) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_890.yaml (in=1062) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR5e-05_WD0.05_Fpeter_Lpeter.yaml (in=976) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_832.yaml (in=1050) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_5737.yaml (in=991) (out=500) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_934.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_995.yaml (in=1059) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_MiniLM.yaml (in=1061) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_977.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2442.yaml (in=1059) (out=533) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_text_align.yaml (in=1097) (out=528) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t.yaml (in=1064) (out=522) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp2.yaml (in=859) (out=443) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20_noalign.yaml (in=1055) (out=512) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp762.yaml (in=1064) (out=534) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_217.yaml (in=1044) (out=543) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t_th.yaml (in=1089) (out=534) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_768.yaml (in=981) (out=487) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2420.yaml (in=1060) (out=531) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_i2t_th.yaml (in=1089) (out=535) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2585.yaml (in=1060) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_2e-4_iter_60_without_VLP.yaml (in=1197) (out=530) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_visualatt_learnable.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_woemb.yaml (in=1077) (out=524) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR4e-05_WD0.05_Feff0f_Leff0f_.yaml (in=986) (out=452) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_MiniLM.yaml (in=1061) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20 (copy).yaml (in=1026) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp752.yaml (in=1058) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_exp3.yaml (in=1347) (out=617) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_950.yaml (in=994) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_888.yaml (in=1045) (out=539) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_minilm.yaml (in=1049) (out=516) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_CC_VLPs_JFTEE_BS512_MaxIter40e_LR1e-03_WD0.05_Feff0f_Leff0f_Tie_ImgLN.yaml (in=969) (out=487) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-4_iter_60_without_VLP.yaml (in=1191) (out=531) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_1029.yaml (in=1062) (out=546) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2489.yaml (in=1052) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_logit.yaml (in=1075) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2422.yaml (in=1060) (out=531) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_sinvisrel.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_2e-4_iter_60_without_VLP.yaml (in=1197) (out=530) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp750.yaml (in=1036) (out=530) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/SCRATCH_CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=900) (out=451) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=474) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2491.yaml (in=1052) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_218.yaml (in=1047) (out=542) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_842.yaml (in=1054) (out=550) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_sparseatt.yaml (in=1045) (out=519) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP.yaml (in=1197) (out=528) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_sparseatt_chamfer.yaml (in=1049) (out=525) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_weightedfeat.yaml (in=1082) (out=524) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp3.yaml (in=859) (out=444) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_940.yaml (in=1001) (out=506) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt.yaml (in=1030) (out=510) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_VLPs_TaxGoogleCC64split_MiniVLM_LR5e-5.yaml (in=875) (out=460) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_979.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_973.yaml (in=1062) (out=549) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_947.yaml (in=1062) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp26.yaml (in=1062) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp25.yaml (in=1062) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill.yaml (in=1060) (out=516) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_894.yaml (in=990) (out=500) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp16.yaml (in=1145) (out=541) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_B_CapS_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base6.yaml (in=945) (out=498) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_exp2.yaml (in=1387) (out=642) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_60_without_VLP.yaml (in=1191) (out=532) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=518) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_895.yaml (in=1050) (out=542) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_231.yaml (in=1057) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20_noalign.yaml (in=1055) (out=514) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_793.yaml (in=1025) (out=529) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_942.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp22.yaml (in=1056) (out=527) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_971.yaml (in=1061) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden.yaml (in=1127) (out=536) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR5e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=489) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2485.yaml (in=993) (out=503) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp745.yaml (in=1037) (out=519) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP.yaml (in=1197) (out=530) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR1e-03_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=518) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR5e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1024) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_841.yaml (in=1055) (out=550) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f.yaml (in=976) (out=496) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_TaxCOCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1041) (out=514) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=981) (out=490) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_215.yaml (in=1054) (out=549) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_889.yaml (in=1050) (out=542) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_978.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_815.yaml (in=1052) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue128.yaml (in=1061) (out=543) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp745.yaml (in=1037) (out=519) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert.yaml (in=1059) (out=520) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_979_eval.yaml (in=982) (out=445) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_exp-n_100distill_iou_i2it2iatt.yaml (in=1114) (out=534) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue1.yaml (in=1056) (out=539) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_entireseq.yaml (in=1029) (out=513) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR1e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_970.yaml (in=1061) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.00001.yaml (in=827) (out=430) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_238.yaml (in=1059) (out=551) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp19.yaml (in=983) (out=490) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_cust298.yaml (in=1384) (out=639) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm_i2i_t2i.yaml (in=1125) (out=543) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp13.yaml (in=1172) (out=552) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue512.yaml (in=1062) (out=542) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_824.yaml (in=1055) (out=552) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm.yaml (in=1094) (out=531) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=996) (out=499) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_minilm.yaml (in=1101) (out=534) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_975.yaml (in=1000) (out=506) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align.yaml (in=1101) (out=536) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_941.yaml (in=1001) (out=507) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR5e-06_WD0.05_Fpeter_Lpeter.yaml (in=976) (out=502) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_sinvisrel.yaml (in=1127) (out=535) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp22.yaml (in=1056) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=554) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed.yaml (in=1129) (out=538) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_exp1.yaml (in=1343) (out=615) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_text_align.yaml (in=1097) (out=527) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_B_CapS_BS512_MaxIter30e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base6.yaml (in=945) (out=498) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_1020.yaml (in=1061) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_woemb.yaml (in=1077) (out=524) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2586.yaml (in=1059) (out=545) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=555) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_827.yaml (in=1050) (out=529) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_10i2iatt_visrel.yaml (in=1118) (out=530) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_960.yaml (in=1041) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_944.yaml (in=1061) (out=546) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_893.yaml (in=991) (out=500) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_vlp1267.yaml (in=1398) (out=649) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_cust296.yaml (in=1394) (out=643) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR5e-06_WD0.05_Feff0f_Leff0f_218.yaml (in=1045) (out=543) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_weightedfeat.yaml (in=1082) (out=524) (deflated 52%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_959.yaml (in=1040) (out=547) (deflated 47%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2421.yaml (in=1060) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue64.yaml (in=1059) (out=540) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_Oscar_TEE_Cap_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=973) (out=492) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_943.yaml (in=1064) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_OSCARb_7M_VLPs_PETER_BS512_MaxIter20e_LR5e-06_WD0.05_Fpeter_Lpeter2.yaml (in=968) (out=499) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align.yaml (in=1101) (out=535) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_2458.yaml (in=1053) (out=546) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter40e_LR4e-05_WD0.05_Feff0f_Leff0f.yaml (in=976) (out=496) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0005.yaml (in=827) (out=430) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_831.yaml (in=1050) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualvisual_tinybert_whole_align_textualhidden_visualseed_exp10.yaml (in=1155) (out=541) (deflated 53%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR4e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_946.yaml (in=1062) (out=547) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter40e_LR0.0001_exp6.yaml (in=859) (out=445) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-5_iter_30_with_VLP_cust296.yaml (in=1394) (out=643) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.0001.yaml (in=827) (out=429) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12_textualatt_10distill_logit.yaml (in=1075) (out=523) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2444.yaml (in=1056) (out=532) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_2434.yaml (in=1047) (out=528) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=965) (out=478) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_916.yaml (in=1064) (out=548) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_exp20.yaml (in=1026) (out=519) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter40e_LR6e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_60_without_VLP.yaml (in=1197) (out=528) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_844.yaml (in=991) (out=499) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP_cust298.yaml (in=1384) (out=640) (deflated 54%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_Distill_JFTEE_BS512_MaxIter40e_LR2e-04_WD0.05_Feff0f_Leff0f_Tie_ImgLN_vlp752.yaml (in=1058) (out=530) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter20e_LR2e-05_WD0.05_Feff0f_Leff0f_queue1024.yaml (in=1065) (out=541) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_JFTEE_BS512_MaxIter30e_LR2e-05_WD0.05_Feff0f_Leff0f_2479.yaml (in=1058) (out=555) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter20e_LR1e-05_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/others/CocoCaption_MiniVLM_COCO_VLP_TEE_BS512_MaxIter20e_LR5e-06_WD0.05_Feff0f_Leff0f_Tie_ImgLN_base12.yaml (in=961) (out=475) (deflated 51%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1813) (out=791) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519564_da993a57_0.08.yaml (in=1824) (out=790) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/log.txt (in=53920) (out=3867) (deflated 93%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1814) (out=783) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1803) (out=781) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1814) (out=785) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1793) (out=776) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1803) (out=785) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1803) (out=784) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1793) (out=777) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1793) (out=776) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1862) (out=803) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1852) (out=798) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622556377_1e410b29.yaml (in=1838) (out=786) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1872) (out=806) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622556174_f5b7f243.yaml (in=1851) (out=795) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1804) (out=777) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1824) (out=790) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1824) (out=789) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1813) (out=789) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1804) (out=777) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519657_c6f06096_0.08.yaml (in=1804) (out=778) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1813) (out=790) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/10_epoch_VLP_OSCAR_FD/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_scale_0.08_Zhiyuan-PyTorch-Test_1622519596_38814551_0.08.yaml (in=1814) (out=783) (deflated 57%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-5_withaugatvlpfinetune_.yaml (in=1014) (out=508) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_5e-5_withaugatvlpfinetune.yaml (in=957) (out=492) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatfinetune.yaml (in=944) (out=487) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_4e-4_withaugatfinetune.yaml (in=949) (out=488) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_4e-4.yaml (in=945) (out=487) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4.yaml (in=940) (out=486) (deflated 48%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatvlpfinetune_.yaml (in=959) (out=493) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/others/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatvlpfinetune.yaml (in=959) (out=493) (deflated 49%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_5e-6_withaugatvlpfinetune.yaml (in=1012) (out=508) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-4_withaugatvlpfinetune.yaml (in=1012) (out=507) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/Vilt_VLP/TaxCocoCaption_Captionnig_Vilt-base_BS128_MaxIter20e_LR0.00001_vlp_lr_1e-5_withaugatvlpfinetune.yaml (in=1012) (out=507) (deflated 50%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_20_multiscale_192_96.yaml (in=1466) (out=663) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_20_VLP-Distill.yaml (in=1404) (out=631) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1.5e-4_iter_20_VLP-Distill.yaml (in=1408) (out=629) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_VLP.yaml (in=1390) (out=618) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_20_multiscale_192_96_smallscale_0.08.yaml (in=1499) (out=666) (deflated 56%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_VLP-Distill.yaml (in=1404) (out=626) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/after_VLP/Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-6_iter_20_multiscale_192_96.yaml (in=1466) (out=662) (deflated 55%)
adding: aux_data/Jacob_config/coco_captioning/test.yaml (in=1228) (out=597) (deflated 51%)
adding: aux_data/Jacob_config/resume/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/resume/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4.yaml (in=2166) (out=815) (deflated 62%)
adding: aux_data/Jacob_config/resume/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_1_1.yaml (in=2109) (out=797) (deflated 62%)
adding: aux_data/Jacob_config/resume/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9_10logit.yaml (in=2272) (out=838) (deflated 63%)
adding: aux_data/Jacob_config/vqa_distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_1logit_0hid.yaml (in=1898) (out=824) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_1ce_1logit.yaml (in=1711) (out=731) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_0ce_1logit.yaml (in=1711) (out=731) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_0ce_10logit.yaml (in=1714) (out=731) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Kim_Distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_Teacher_vinvl_large_1ce_10logit.yaml (in=1714) (out=732) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_0logit_10hid.yaml (in=1901) (out=825) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620849955_7efc3f53_test.yaml (in=2051) (out=837) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_1logit_1hid.yaml (in=1898) (out=822) (deflated 57%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620849955_7efc3f53.yaml (in=2052) (out=835) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620847155_ada34def.yaml (in=2052) (out=835) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Zhiyuan-PyTorch-Test_1620849935_34a03392.yaml (in=2052) (out=835) (deflated 59%)
adding: aux_data/Jacob_config/vqa_distill/Jacob_Vilt_VQA_Distill_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_71_1ce_0logit_0hid.yaml (in=1898) (out=823) (deflated 57%)
adding: aux_data/Jacob_config/cc_captioning/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_256_ENC_DEC_vit_base_patch16_384_lr_1e-4_iter_120_without_VLP.yaml (in=1774) (out=715) (deflated 60%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_256_ENC_DEC_vit_base_patch16_384_lr_1e-4_iter_120_without_VLP_test.yaml (in=1792) (out=721) (deflated 60%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_16_encoder_vit_base_patch16_384_lr_1e-4_iter_120_with_VLP_distillation.yaml (in=2122) (out=901) (deflated 58%)
adding: aux_data/Jacob_config/cc_captioning/Vilt_cc_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Jacob_Vilt_CC_captioning_val_testing_batch-size_512_encoder_vit_base_patch32_384_with_VLP_Zhiyuan-PyTorch-Test_1623116931_2268c9d0.yaml (in=1383) (out=568) (deflated 59%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_large_CC_caption_uni_batch-size_1024_lr_5e-5_iter_30.yaml (in=1163) (out=523) (deflated 55%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_CC_CIDER_caption_uni_batch-size_64_lr_5e-5_iter_5.yaml (in=1152) (out=526) (deflated 54%)
adding: aux_data/Jacob_config/cc_captioning/Vilt_cc_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_Jacob_Vilt_CC_captioning_val_testing_batch-size_512_encoder_vit_base_patch32_384_with_VLP_Zhiyuan-PyTorch-Test_1622850950_2792e5e4 .yaml (in=1362) (out=562) (deflated 59%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_CC_caption_uni_batch-size_1024_lr_5e-5_iter_30.yaml (in=1107) (out=503) (deflated 55%)
adding: aux_data/Jacob_config/cc_captioning/vinvl_CC_caption_uni_batch-size_512_lr_5e-5_iter_10.yaml (in=1104) (out=502) (deflated 55%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_with_vlp.yaml (in=2081) (out=880) (deflated 58%)
adding: aux_data/Jacob_config/cc_captioning/Google-CC_Vilt_captioning_testing_batch-size_16_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_2_4.yaml (in=1802) (out=712) (deflated 60%)
adding: aux_data/Jacob_config/NoCaps/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs.yaml (in=1989) (out=829) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ViTCAP_NO-VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_no-cbs.yaml (in=2015) (out=799) (deflated 60%)
adding: aux_data/Jacob_config/NoCaps/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_276.yaml (in=1588) (out=640) (deflated 60%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ViTCAP_NO-VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs.yaml (in=2038) (out=800) (deflated 61%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6.yaml (in=1715) (out=744) (deflated 57%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ViTCAP_NO-VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs_cider.yaml (in=2128) (out=889) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_no-cbs.yaml (in=1990) (out=828) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val_cbs2.yaml (in=1989) (out=829) (deflated 58%)
adding: aux_data/Jacob_config/NoCaps/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6_val.yaml (in=1965) (out=818) (deflated 58%)
adding: aux_data/Jacob_config/VQA/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_5e-4_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=586) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_2e-4_iter_40_small_0.8_without_VLP.yaml (in=1354) (out=586) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP.yaml (in=1360) (out=588) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=584) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_5e-5_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=584) (deflated 57%)
adding: aux_data/Jacob_config/VQA/VLP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551.yaml (in=1572) (out=689) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556174_f5b7f243_20epoch.yaml (in=1577) (out=689) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556377_1e410b29_20epoch.yaml (in=1564) (out=678) (deflated 57%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096_20epoch.yaml (in=1578) (out=692) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57_20epoch.yaml (in=1598) (out=701) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556174_f5b7f243.yaml (in=1561) (out=684) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519564_da993a57.yaml (in=1582) (out=695) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519596_38814551_20epoch.yaml (in=1588) (out=695) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622519657_c6f06096.yaml (in=1562) (out=688) (deflated 56%)
adding: aux_data/Jacob_config/VQA/VLP/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_40_small_0.8_without_VLP_Zhiyuan-PyTorch-Test_1622556377_1e410b29.yaml (in=1548) (out=673) (deflated 57%)
adding: aux_data/Jacob_config/VQA/Vilt_VQA_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_2e-4_iter_40_small_0.08_without_VLP.yaml (in=1357) (out=586) (deflated 57%)
adding: aux_data/Jacob_config/others/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/others/0ce_1logit.json (in=15792897) (out=616979) (deflated 96%)
adding: aux_data/Jacob_config/others/vilt/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/others/vilt/vqa_uni_pipeline.yaml (in=1713) (out=589) (deflated 66%)
adding: aux_data/Jacob_config/others/vilt/vlp_vilt.yaml (in=2523) (out=818) (deflated 68%)
adding: aux_data/Jacob_config/others/vilt/ignore_pattern.yaml (in=75) (out=49) (deflated 35%)
adding: aux_data/Jacob_config/others/vilt/waiting_then_vilt_vqa.yaml (in=3238) (out=942) (deflated 71%)
adding: aux_data/Jacob_config/others/vilt/caption_uni_pipeline.yaml (in=1494) (out=467) (deflated 69%)
adding: aux_data/Jacob_config/others/vilt/vilt_vqa_uni_pipeline.yaml (in=2218) (out=713) (deflated 68%)
adding: aux_data/Jacob_config/others/vilt/waiting_then_vilt_caption.yaml (in=2531) (out=751) (deflated 70%)
adding: aux_data/Jacob_config/others/vilt/caption_uni_pipeline_debug.yaml (in=1491) (out=465) (deflated 69%)
adding: aux_data/Jacob_config/others/caption_uni_pipeline_teacher.yaml (in=1174) (out=509) (deflated 57%)
adding: aux_data/Jacob_config/others/VG-SGG-dicts-vgoi6-clipped.json (in=108739) (out=122) (deflated 100%)
adding: aux_data/Jacob_config/others/my.png (in=254733) (out=254741) (deflated 0%)
adding: aux_data/Jacob_config/kim_vilt/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_CC_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP.yaml (in=1351) (out=599) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP.yaml (in=1350) (out=592) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-5_iter_30_with_VLP.yaml (in=1353) (out=589) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_2e-4_iter_30_with_VLP.yaml (in=1350) (out=592) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_5e-6_iter_30_with_VLP.yaml (in=1353) (out=593) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_5_with_VLP_SCST.yaml (in=1516) (out=630) (deflated 58%)
adding: aux_data/Jacob_config/kim_vilt/Kim_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP.yaml (in=1353) (out=592) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_eval.yaml (in=1305) (out=587) (deflated 55%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_5e-5_iter_10_with_VLP.yaml (in=1352) (out=595) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier_.yaml (in=1397) (out=607) (deflated 57%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP.yaml (in=1353) (out=598) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier_dropout_0.1.yaml (in=1397) (out=606) (deflated 57%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP.yaml (in=1353) (out=597) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_with_VLP.yaml (in=1353) (out=598) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_eval2.yaml (in=1326) (out=592) (deflated 55%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier_dropout_0.3.yaml (in=1381) (out=603) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_5e-5_iter_30_with_VLP.yaml (in=1352) (out=596) (deflated 56%)
adding: aux_data/Jacob_config/kim_vilt/vqa/Kim_Vilt_vqa_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_kim_classifier.yaml (in=1373) (out=600) (deflated 56%)
adding: aux_data/Jacob_config/ViTCAP/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk_singlenode_test.yaml (in=2055) (out=866) (deflated 58%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-4nodetest.yaml (in=1897) (out=813) (deflated 57%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk.yaml (in=2036) (out=860) (deflated 58%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-singlenodetest.yaml (in=1896) (out=813) (deflated 57%)
adding: aux_data/Jacob_config/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk_nodistill.yaml (in=2051) (out=860) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-5_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_4096_iter7_resume.yaml (in=2054) (out=820) (deflated 60%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_4096_iter20.yaml (in=1861) (out=785) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_512_iter20.yaml (in=1858) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/ (in=0) (out=0) (stored 0%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter20_FCDistill_AllHidden.yaml (in=1773) (out=727) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter40_FCDistill_AllHidden.yaml (in=1771) (out=726) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_batch_size_2048_iter10.yaml (in=1701) (out=701) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter10_FCDistill_AllHidden.yaml (in=1771) (out=727) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_1024_iter20_FCDistill_AllHidden_debug.yaml (in=1798) (out=732) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter10.yaml (in=1731) (out=713) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10.yaml (in=1728) (out=714) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter10_FCDistill.yaml (in=1755) (out=725) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_batch_size_4096_iter10.yaml (in=1701) (out=702) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_10hid_batch_size_4096_iter100_FCDistill_AllHidden.yaml (in=1774) (out=728) (deflated 59%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_128_iter20.yaml (in=1820) (out=781) (deflated 57%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_1024_iter20.yaml (in=1861) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_512_iter20.yaml (in=1858) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_1hid_batch_size_4096_iter20.yaml (in=1861) (out=783) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_2048_iter20.yaml (in=1861) (out=782) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_2048_iter20.yaml (in=1861) (out=782) (deflated 58%)
adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/ (in=0) (out=0)
(stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s.yaml (in=1792) (out=739) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1hid_batch_size_4096_iter10_s2s.yaml (in=1785) (out=736) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10hid_batch_size_4096_iter10_s2s_FC_all_token.yaml (in=1842) (out=747) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_2048_iter10_s2s_finetuned.yaml (in=1963) (out=822) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter100_s2s_multi_scale.yaml (in=1888) (out=763) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s_FC.yaml (in=1825) (out=740) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1821) (out=739) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1807) (out=751) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_0ce_10logit_batch_size_4096_iter10_s2s.yaml (in=1792) (out=738) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_1hid_batch_size_4096_iter10_s2s.yaml (in=1798) (out=738) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1793) (out=740) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_bid_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_4096_iter10_s2s.yaml (in=1789) (out=737) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1logit_10hid_batch_size_4096_iter10_s2s.yaml (in=1759) (out=725) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s.yaml 
(in=1747) (out=724) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_10_hid_batch_size_2048_iter10_s2s.yaml (in=1756) (out=723) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_2048_iter40_s2s.yaml (in=1744) (out=721) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1764) (out=728) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_batch_size_2048_iter10_s2s.yaml (in=1732) (out=716) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_sample_0.9.yaml (in=1991) (out=774) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale.yaml (in=1845) (out=739) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale.yaml (in=1845) (out=740) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_FC_alltoken.yaml (in=1838) (out=751) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter100_s2s_multi_scale.yaml (in=1848) (out=739) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter100_s2s_multi_scale_crop.yaml (in=1950) (out=762) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_sample_0.6.yaml (in=1991) (out=776) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop.yaml (in=1888) (out=749) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_crop_4_8.yaml (in=1947) (out=761) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1817) 
(out=742) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_sample_0.8.yaml (in=1991) (out=776) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_crop_2_4.yaml (in=1947) (out=761) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_4096_iter10_s2s_FC.yaml (in=1805) (out=739) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/priori/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter100_s2s_multi_scale.yaml (in=1851) (out=740) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8_10logit_continue_2epoch.yaml (in=2297) (out=850) (deflated 63%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9.yaml (in=2012) (out=783) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9_10logit.yaml (in=2028) (out=787) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8.yaml (in=2019) (out=788) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/gumbel/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8_10logit.yaml (in=2035) (out=792) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4.yaml (in=1924) (out=759) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4.yaml (in=1921) (out=758) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter40_s2s_16_384_small_0.08.yaml (in=1856) (out=746) (deflated 60%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s.yaml (in=1963) (out=787) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_vinvl_large_4M_seq2seq_uni_batch-size_1024_lr_1e-4_iter_80.yaml (in=1170) (out=541) (deflated 54%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s.yaml (in=1960) (out=788) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_1_1.yaml (in=1888) (out=754) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter10_s2s_multi_scale_crop_1_2_4.yaml (in=1922) (out=757) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter80_s2s_multi_scale_crop_1_2_4_resume.yaml (in=2161) (out=818) (deflated 62%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter10_s2s_multi_scale_crop_1_2_4_s2s_teacher.yaml (in=2048) (out=812) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter40_s2s_multi_scale_crop_1_2_4.yaml (in=1899) (out=754) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk_singlenode_test.yaml (in=2055) (out=866) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-4nodetest.yaml (in=1897) (out=813) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk.yaml (in=2024) (out=856) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/ViTCAP/Jacob_VLP-ViTCAP_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_120_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk-singlenodetest.yaml (in=1896) (out=813) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_512_iter10_s2s_multi_scale_crop_1_2_4_small_0.08.yaml (in=1902) (out=753) (deflated 60%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s.yaml (in=2031) (out=803) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_6e-1.yaml (in=2008) (out=856) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding.yaml (in=1919) (out=819) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu.yaml (in=1869) (out=806) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_0e-1.yaml (in=2008) (out=855) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s_init_vitbfocal40.yaml (in=2197) (out=887) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1.yaml (in=1809) (out=770) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu.yaml (in=1897) (out=814) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1.yaml (in=1770) (out=754) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_1e-1.yaml (in=2008) (out=855) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_2e-1.yaml (in=2008) (out=853) (deflated 58%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_3e-1.yaml (in=2008) (out=856) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_8e-1.yaml (in=2008) (out=855) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_8_multiplier_0.1.yaml (in=1809) (out=769) (deflated 57%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_10_vinvl_tags_ENC-DEC_Split_4_multiplier_0.1_32_gpu_cls_embedding_tagger_from_scratch_gen-ratio_1e-1_test.yaml (in=2007) (out=852) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-5_iter_80_vinvl_tags_ENC-DEC_Split_8_multiplier_0.1_32_gpu.yaml (in=1813) (out=766) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/large_scale/encdec_with_tags/Jacob_VLP_Distill_ENC_DEC_vit_base_patch16_384_lr_1e-4_small_0.08_VinVL_Teacher_vinvl_1ce_10logit_batch_size_1024_iter80_s2s_init_vitbfocal40_tags_vvitbfocal40crop008.yaml (in=2262) (out=903) (deflated 60%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_NOFC.yaml (in=1769) (out=734) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/diff_topk/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/diff_topk/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.9_10logit.yaml (in=2022) (out=787) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/diff_topk/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_small_0.9_VinVL_Teacher_vinvl_1ce_10logit_batch_size_256_iter10_s2s_multi_scale_multi_crop_gumbel_sample_0.8_10logit.yaml (in=2022) (out=790) (deflated 61%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_4096_iter80_s2s.yaml (in=1744) (out=722) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_0logit_10hid_batch_size_4096_iter10_s2s_FC.yaml (in=1764) (out=729) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_2048_iter10_s2s.yaml (in=1744) (out=720) (deflated 59%) adding: 
aux_data/Jacob_config/VLP_distill/s2s_vinvl_distill/Jacob_VLP_Distill_encoder_vit_base_patch32_384_lr_1e-4_iter_20_small_0.9_VinVL_Teacher_vinvl_1ce_1_logit_batch_size_4096_iter10_s2s.yaml (in=1748) (out=722) (deflated 59%) adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_4096_iter20.yaml (in=1861) (out=784) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-4_iter_20_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_0logit_0hid_batch_size_1024_iter20.yaml (in=1861) (out=783) (deflated 58%) adding: aux_data/Jacob_config/VLP_distill/Jacob_VLP_Distill_encoder_vit_base_patch16_384_lr_1e-5_small_0.9_without_VLP_Teacher_eff0fpeter_1ce_1logit_0hid_batch_size_4096_iter7_resume_.yaml (in=2054) (out=817) (deflated 60%) adding: aux_data/Jacob_config/VLP/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/TaxCCSBUCocoVGCapSplit_TEST_512.yaml (in=1202) (out=555) (deflated 54%) adding: aux_data/Jacob_config/VLP/multi_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/multi_scale/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0001_Warm3e.yaml (in=1431) (out=632) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS512_MaxIter30e_LR0.0002_Warm3e_tokensample_378.yaml (in=1440) (out=634) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0001_Warm3e_tokensample_378.yaml (in=1458) (out=641) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0001_Warm3e.yaml (in=1402) (out=616) (deflated 56%) adding: aux_data/Jacob_config/VLP/multi_scale/others/TaxCCSBUCocoVGCapSplit_MultiScale_VLP_Vilt-base_VLPS_BS1024_MaxIter30e_LR0.0002_Warm3e_tokensample_378.yaml (in=1458) (out=643) (deflated 56%) adding: aux_data/Jacob_config/VLP/TaxCOCOCaption_TEST_512.yaml (in=1169) (out=546) (deflated 53%) adding: aux_data/Jacob_config/VLP/Jacob_Vilt_VLP_TaxCCSBUCocoVGCap_iter_80_lr_1e-4.yaml (in=1206) (out=558) (deflated 54%) adding: aux_data/Jacob_config/VLP/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug_384transform.yaml (in=894) (out=456) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0004_Warm3e.yaml (in=838) (out=438) (deflated 48%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter10e_LR0.00056_Warm1e_Feff0f_Leff0f_base12.yaml (in=967) (out=476) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/COCO_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff0f_base12.yaml (in=848) (out=439) (deflated 48%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug.yaml (in=878) (out=449) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff3f_base12.yaml (in=941) (out=485) (deflated 48%) adding: 
aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS2048_MaxIter5e_LR0.0004_Warm1e_Feff0f_Leff0f_base12.yaml (in=961) (out=474) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff0f_base12.yaml (in=941) (out=482) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug_384transform_exp4.yaml (in=904) (out=459) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e.yaml (in=840) (out=439) (deflated 48%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter30e_LR0.00056_Warm0e_Feff0f_Leff0f_base12_908.yaml (in=1127) (out=533) (deflated 53%) adding: aux_data/Jacob_config/VLP/others/CC_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0f_Leff0f_base12.yaml (in=882) (out=453) (deflated 49%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter1e_LR0.00056_Warm0e_Feff0f_Leff0f_base12.yaml (in=964) (out=475) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/COCO_VLP_MiniVLM_VLPS_BS2048_MaxIter100e_LR0.00008_Warm5e_Feff0f_Leff0f_base12_vlp765.yaml (in=955) (out=466) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Oscarbase_VLPS_BS2048_MaxIter100e_LR0.0002_Warm5e_Feff0fpeter_Leff0fpeter_base12.yaml (in=999) (out=476) (deflated 52%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter50e_LR0.00056_Warm5e_Feff0f_Leff0f_base12.yaml (in=967) (out=478) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter20e_LR0.00056_Warm5e_Feff0f_Leff0f_base12.yaml (in=967) (out=478) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Oscarbase_VLPS_BS2048_MaxIter100e_LR0.0004_Warm5e_Feff0fpeter_Leff0fpeter_base12.yaml (in=999) (out=475) (deflated 52%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Oscarbase_VLPS_BS2048_MaxIter100e_LR0.0001_Warm5e_Feff0fpeter_Leff0fpeter_base12.yaml (in=999) (out=476) (deflated 52%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_MiniVLM_VLPS_BS4096_MaxIter50e_LR0.0008_Warm5e_Feff0f_Leff0f_base12.yaml (in=964) (out=477) (deflated 51%) adding: aux_data/Jacob_config/VLP/others/TaxCCSBUCocoGQAFlk30VqaVGqaSplit_VLP_Vilt-base_VLPS_BS4096_MaxIter40e_LR0.0001_Warm3e_dataaug_384transform_exp5.yaml (in=904) (out=459) (deflated 49%) adding: aux_data/Jacob_config/VLP/coco_bidirectional_finetune/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/VLP/coco_bidirectional_finetune/TaxCocoCaption_VLP_Vilt-base_VLPS_BS4096_MaxIter30e_LR5.0e-4_Warm3e.yaml (in=1155) (out=527) (deflated 54%) adding: aux_data/Jacob_config/VLP/TaxCCSBUCocoVGCapSplit_TEST.yaml (in=1202) (out=556) (deflated 54%) adding: aux_data/Jacob_config/CIDEr_optimize/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_200.yaml (in=1595) (out=642) (deflated 60%) adding: 
aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_276.yaml (in=1596) (out=643) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_1_epoch_5.yaml (in=1754) (out=748) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6.yaml (in=1752) (out=751) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_64_Sample_300.yaml (in=1596) (out=642) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_sample_0.4.yaml (in=1723) (out=736) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix_BS_16_Sample_276.yaml (in=1596) (out=640) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_5.0e-6_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_NO_VLP_tags_vitbfocal10_num_2_epoch_5_lr_3e-6.yaml (in=1790) (out=777) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/cider_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_256_encoder_vit_base_patch16_384_without_VLP_multi_scale_fix.yaml (in=1561) (out=631) (deflated 60%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_30_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_sample_0.3.yaml (in=1740) (out=743) (deflated 57%) adding: aux_data/Jacob_config/CIDEr_optimize/JACOB_CIDER_Jacob_Logit_Distill_Vilt_captioning_testing_lr_1.0e-4_iter_150_batch-size_32_ENC_DEC_patch16_384_ENC_DEC_S2S_80_VLP_tags_vitbfocal10_num_1_epoch_25.yaml (in=1778) (out=758) (deflated 57%) adding: aux_data/Jacob_config/new_result/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/VinVL_Label_60_epoch_0.05_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/VinVL_Label_60_epoch_0.05_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.05_caption_emb.yaml (in=1860) (out=804) (deflated 57%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/caption_Embedding/VinVL_Label_60_epoch_0.05_lrreduc/70_driver_log_0 (27).txt (in=3741815) (out=463374) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.1_lrreduc/ 
(in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1828) (out=797) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.1_lrreduc/70_driver_log_0 (27).txt (in=3834561) (out=472577) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinL_Label_30_epoch_0.05_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinL_Label_30_epoch_0.05_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.05_CIDEr.yaml (in=1831) (out=797) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinL_Label_30_epoch_0.05_lrreduc/70_driver_log_0 (27).txt (in=2519172) (out=300760) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.01_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.01_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.01.yaml (in=1831) (out=797) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_60_epoch_0.01_lrreduc/70_driver_log_0 (27).txt (in=3741815) (out=463374) (deflated 88%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_30_epoch_0.1_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/New_learnable_embedding_layer/VinVL_Label_30_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_CIDEr_119.4.yaml (in=1828) (out=796) (deflated 56%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/new_result/coco-caption_no_vlp/classifier_Embedding/VinVL_Label_60_epoch_0.1_lrreduc/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1870) (out=811) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP.yaml (in=1544) (out=677) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale_0ce_1logit.yaml (in=1704) (out=720) (deflated 58%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_0ce_1logit.yaml (in=1704) (out=721) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_test.yaml (in=2057) (out=737) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct+Distill.yaml (in=2017) (out=779) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_4_8.yaml (in=1809) (out=712) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vitb_ENC-DEC_len70_conf_0.3.yaml (in=1507) (out=628) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale.yaml (in=2043) (out=821) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.2_inference_sample.yaml (in=1856) (out=731) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349.yaml (in=2143) (out=857) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill.yaml (in=1862) (out=742) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_0.9.yaml (in=1787) (out=728) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6.yaml (in=1830) (out=746) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower.yaml (in=2065) (out=739) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_no_tags.yaml (in=1786) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_topk_attn_token_select_0.6_layer_9_infer_sample.yaml (in=2072) (out=799) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_random_select_0.6_test_sample.yaml (in=1920) (out=766) (deflated 
60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_with_VLP_multi_scale_ENC-DEC.yaml (in=1835) (out=728) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_ClipViT.yaml (in=1900) (out=754) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_CC_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_test.yaml (in=1943) (out=750) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_0.8.yaml (in=1807) (out=732) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9_patch_select_identity.yaml (in=1766) (out=709) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_test.yaml (in=2014) (out=756) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_eff0f.yaml (in=1789) (out=716) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_with_VLP_multi_scale_ENC-ENC.yaml (in=1819) (out=722) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_30__vitbfocal40_vinvl_tags.yaml (in=1757) (out=762) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_64_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6.yaml (in=1963) (out=751) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_topk_token_select_0.6.yaml (in=1961) (out=775) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vlinvits_ENC-DEC.yaml (in=1617) (out=663) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_multi_scale_0.9.yaml (in=1774) (out=731) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_seperate_cls_test.yaml (in=2083) (out=745) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_32_384_lr_1e-4_iter_30_without_VLP_mutual_tower.yaml (in=1918) (out=724) (deflated 62%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff2f.yaml (in=1779) (out=712) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.9_centroid.yaml (in=1847) (out=749) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_16_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_4_8.yaml (in=1828) (out=720) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vlinvits.yaml (in=1809) (out=723) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_80_iter_VLP_multi_scale.yaml (in=2045) (out=814) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_480_lr_1e-4_iter_30.yaml (in=1768) (out=707) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_7.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_identity2.yaml (in=1866) (out=755) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_generate_tags_ENC-DEC.yaml (in=1426) (out=629) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.4_inference_sample.yaml (in=1856) (out=730) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_9.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_0.9.yaml (in=1774) (out=732) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_with_vitbfocal10_tags.yaml (in=2057) (out=846) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9.yaml (in=1873) (out=763) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl.yaml (in=1789) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_sqr_attention.yaml (in=1814) (out=718) 
(deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags_ENC-DEC.yaml (in=1829) (out=738) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_ENC-DEC.yaml (in=1831) (out=728) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_sqr_attention_alter.yaml (in=1847) (out=729) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_0.6.yaml (in=1809) (out=734) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_ENC-ENC.yaml (in=1819) (out=722) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct_eval.yaml (in=1939) (out=763) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_inference.yaml (in=1983) (out=822) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9-.yaml (in=1889) (out=766) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale.yaml (in=1823) (out=716) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_30__vitbfocal10_vinvl_tags2.yaml (in=1757) (out=761) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vitb_ENC-DEC_len70.yaml (in=1489) (out=623) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_0.9.yaml (in=1807) (out=732) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.3_centroid.yaml (in=1847) (out=749) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.3.yaml (in=1830) (out=748) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct+Distill.yaml (in=1998) (out=780) (deflated 61%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_sqr_attention.yaml (in=1814) (out=717) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_without_VLP.yaml (in=1513) (out=639) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9_token_select.yaml (in=1728) (out=695) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6_centroid.yaml (in=1847) (out=748) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_80_iter_VLP_multi_scale3.yaml (in=2046) (out=813) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_5.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining.yaml (in=1972) (out=815) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.9.yaml (in=1830) (out=748) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9.yaml (in=1720) (out=695) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.2.yaml (in=1821) (out=723) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill.yaml (in=2059) (out=865) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_not_all_token.yaml (in=2086) (out=872) (deflated 58%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding-distill/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_distill_s=8.yaml (in=2067) (out=871) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-single-tower-vinvl/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-single-tower-vinvl/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl.yaml (in=1758) (out=710) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40crop008/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_1024_ENC_DEC_vit_base_patch32_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC.yaml (in=1506) (out=625) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_concat.yaml (in=1764) (out=755) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1.yaml (in=1751) (out=749) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_split_8.yaml (in=1781) (out=755) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_3e-5_iter_30_vitbfocal10_tags_ENC-DEC_vitbfocal40.yaml (in=1695) (out=733) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.01_inference.yaml (in=1749) (out=751) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_10caption_loss_1tag_loss.yaml (in=1721) (out=737) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.2.yaml (in=1750) (out=748) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_3e-5_iter_30_vitbfocal10_tags_ENC-DEC.yaml (in=1671) (out=728) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_10caption_loss_1tag_loss.yaml (in=1721) (out=737) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_1.yaml (in=1745) (out=746) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_1_inference.yaml (in=1740) (out=747) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_differentiable-topk.yaml (in=1816) (out=772) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.01.yaml (in=1754) (out=749) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_inference.yaml (in=1744) (out=747) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.5.yaml (in=1750) (out=748) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_split_8_inference.yaml (in=1774) (out=753) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10-caption_tagging/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-5_iter_30_vitbfocal10_tags_ENC-DEC_multiplier_0.1_concat_inference.yaml (in=1758) (out=753) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-eff0f/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-eff0f/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff0f_ENC-DEC_len70.yaml (in=1506) (out=626) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_5e-5_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1483) (out=631) (deflated 57%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1489) (out=621) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_5e-5_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1489) (out=624) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1483) (out=628) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal10_ENC-DEC_len70.yaml (in=1504) (out=629) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-vitbfocal40/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vitbfocal40_ENC-DEC_len70.yaml (in=1492) (out=625) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-VinVL/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-VinVL/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_VinVL_tags_ENC-DEC.yaml (in=1464) (out=618) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005_noexpand.yaml (in=1811) (out=787) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_split_8.yaml (in=1832) (out=797) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1816) (out=792) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.2_split_8.yaml (in=1832) (out=796) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005.yaml (in=1817) (out=793) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.2.yaml (in=1816) (out=792) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb_all_tokens.yaml (in=1897) (out=819) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_test.yaml (in=1865) (out=806) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert.yaml (in=1917) (out=815) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_bertemb_all_tokens.yaml (in=1900) (out=814) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption_only-noall.yaml (in=1992) (out=833) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption_only.yaml (in=1980) (out=827) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_pred_tag_caption.yaml (in=1977) (out=832) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb.yaml (in=1871) (out=808) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_no_tags.yaml (in=1843) (out=803) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption-all+vinvl.yaml (in=1993) (out=838) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_bertemb.yaml (in=1863) (out=805) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb.yaml (in=1860) (out=812) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_nltk.yaml (in=1903) (out=818) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_all_tokens2.yaml (in=1910) (out=819) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_fuse_pred_tag_caption.yaml (in=2015) (out=847) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption+vinvl.yaml (in=1974) (out=834) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_gt_tag_caption.yaml (in=1971) (out=831) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_all_tokens.yaml (in=1908) (out=815) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_new_bertemb_all_tokens-gradient.yaml (in=1961) (out=825) (deflated 58%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/v2/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_caption_tags_encode_bert_init_from_caption_oscar.yaml (in=1981) (out=848) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.2_split_8.yaml (in=1832) (out=796) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_split_8.yaml (in=1832) (out=797) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1.yaml (in=1810) (out=790) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_tie_tag_bert_weight.yaml (in=1860) (out=801) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_bertemb.yaml (in=1857) (out=803) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb.yaml (in=1854) (out=809) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.3.yaml (in=1816) (out=793) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005_noexpand.yaml (in=1811) (out=787) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb2_nograd.yaml (in=1870) (out=814) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_noexpand.yaml (in=1810) (out=786) (deflated 57%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb_nograd.yaml (in=1868) (out=814) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_005.yaml (in=1817) (out=793) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal20-bert-tokenizer-expanding/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_clsemb2_nograd_alltokens.yaml (in=1907) (out=823) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal10/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC.yaml (in=1482) (out=624) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC.yaml (in=1503) (out=630) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC_conf_0.4.yaml (in=1521) (out=634) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC_conf_0.8.yaml (in=1521) (out=633) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/16-384-vitbfocal40crop008/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_5e-5_iter_30_VinVL_tags_ENC-DEC_conf_0.6.yaml (in=1521) (out=634) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-VinVL/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/tagger_caption/32-384-VinVL/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC_len70.yaml (in=1517) (out=628) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_1.0.yaml (in=1753) (out=719) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626887552_3db73349_correct.yaml (in=2160) (out=858) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_VLP_multi_scale_test.yaml (in=1825) (out=719) (deflated 61%) 
adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_1e-4_iter_30__vitbfocal10_vinvl_tags.yaml (in=1757) (out=762) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_linvits_test.yaml (in=1789) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_Zhiyuan-PyTorch-Test_1626596599_66f2f469.yaml (in=2138) (out=856) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vitb_ENC-DEC_len70_conf_0.5.yaml (in=1507) (out=628) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC_len70.yaml (in=1506) (out=622) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_generate_tags_ENC-DEC_inference.yaml (in=1420) (out=627) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_128_encoder_vit_base_patch16_480_lr_1e-4_iter_30_tags_linvits.yaml (in=1795) (out=719) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_16_224_lr_1e-4_iter_30_without_VLP_multi_two_tower.yaml (in=1911) (out=723) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_0.9.yaml (in=2050) (out=821) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC_len70_nopad.yaml (in=1514) (out=632) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_2.yaml (in=1927) (out=770) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.7.yaml (in=1873) (out=764) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_16_384_lr_1e-4_iter_30_without_VLP_multi_two_tower.yaml (in=1911) (out=725) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_VinVL_tags_ENC-DEC.yaml (in=1464) (out=618) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_80_iter_VLP_multi_scale2.yaml (in=2046) (out=815) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_vinvl_tag-iter_pretraining.yaml 
(in=2189) (out=883) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_20_with_VLP_multi_scale.yaml (in=1741) (out=722) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_iter_pretraining_test.yaml (in=2142) (out=784) (deflated 63%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_kmeans_select_0.6.yaml (in=1830) (out=746) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_topk_attn_token_select_0.6_layer_9_no_infer.yaml (in=2067) (out=799) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_large_test.yaml (in=2095) (out=750) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_vinvl-vitfocal40_tag-iter_pretraining.yaml (in=2224) (out=893) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_with_vinvl_tags.yaml (in=2039) (out=834) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_linvits.yaml (in=1796) (out=719) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff0f.yaml (in=1781) (out=714) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_self_attention_0.6.yaml (in=2340) (out=898) (deflated 62%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags.yaml (in=1567) (out=654) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_random_select_0.6.yaml (in=1923) (out=768) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_linvits.yaml (in=1795) (out=717) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_0.6.yaml (in=1788) (out=731) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff4f.yaml (in=1779) (out=712) (deflated 60%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_4.yaml (in=1927) (out=770) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_patch_select_identity3.yaml (in=1866) (out=755) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_random_0.3.yaml (in=1822) (out=721) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags_ENC-DEC_80_epoch_S2S_pre-training.yaml (in=1875) (out=783) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_token_drop.yaml (in=1856) (out=724) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_ENC-DEC.yaml (in=1835) (out=728) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_patch_select_0.3.yaml (in=1787) (out=728) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.8.yaml (in=1873) (out=763) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_uni_vlp_distill_80_iter_pretraining.yaml (in=2125) (out=859) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_10_with_VLP_multi_scale_token_sample_0.9_patch_select.yaml (in=1748) (out=703) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale_token_sample_0.9_.yaml (in=1882) (out=764) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_eff0f_test.yaml (in=1775) (out=712) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_6.yaml (in=1927) (out=771) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_without_VLP_multi_scale_tower_large.yaml (in=2077) (out=745) (deflated 64%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_30_no_tags.yaml (in=1786) (out=715) (deflated 60%) adding: 
aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_3.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch32_384_lr_1e-4_iter_60_sqr_attention_alter.yaml (in=1847) (out=729) (deflated 61%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_batch-size_256_ENC_DEC_encoder_vit_base_patch16_384_lr_5e-5_iter_30_uni_vlp_distill_80_iter_pretraining_with_vinvl_tags_inference.yaml (in=1825) (out=745) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_tags_vinvl.yaml (in=1789) (out=715) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_with_VLP_multi_scale_later_concat_8.yaml (in=1927) (out=772) (deflated 60%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch32_384_lr_1e-4_iter_30_with_VLP_multi_scale.yaml (in=1767) (out=731) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_no_tags_ENC-DEC.yaml (in=1446) (out=609) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/priori/Logit_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch32_384_lr_1e-4_iter_30_tags_vinvl_ENC-DEC.yaml (in=1505) (out=624) (deflated 59%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale.yaml (in=1682) (out=715) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale.yaml (in=1682) (out=714) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale_0ce_1logit.yaml (in=1704) (out=721) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_30_without_VLP_multi_scale_0ce_1logit_10hidden.yaml (in=1736) (out=731) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_384_lr_1.0e-4_iter_30_without_VLP_multi_scale_token_sample_378.yaml (in=1726) (out=736) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/multi_scale/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_without_VLP_multi_scale.yaml (in=1682) (out=714) (deflated 58%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP.yaml (in=1546) (out=677) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP.yaml (in=1546) (out=677) (deflated 56%) adding: 
aux_data/Jacob_config/coco_Distill/Textual_hid_Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP_hidden_weight_10.yaml (in=1682) (out=716) (deflated 57%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_384_lr_2e-4_iter_30_without_VLP.yaml (in=1547) (out=679) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_1024_encoder_vit_base_patch16_224_lr_2e-4_iter_30_without_VLP.yaml (in=1547) (out=679) (deflated 56%) adding: aux_data/Jacob_config/coco_Distill/Logit_Distill_Vilt_captioning_testing_batch-size_256_encoder_vit_base_patch16_224_lr_1e-4_iter_30_without_VLP_debug.yaml (in=1690) (out=732) (deflated 57%) adding: aux_data/Jacob_config/Tagger/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_512.yaml (in=1241) (out=574) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_0.02_BS_512_SGD.yaml (in=1231) (out=589) (deflated 52%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_rank_crop_0.08.yaml (in=1355) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2.yaml (in=1281) (out=590) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_512.yaml (in=1249) (out=577) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_rank_crop_0.08_inference.yaml (in=1350) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.9_inference.yaml (in=1350) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_0.1_BS_512_SGD.yaml (in=1228) (out=588) (deflated 52%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08.yaml (in=1358) (out=609) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024.yaml (in=1252) (out=577) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_rank_crop_0.9.yaml (in=1352) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCocoCaption_B_Vilt_ViT_16_384_10_epoch_lr_2e-2_BS_256.yaml (in=1241) (out=573) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_rank.yaml (in=1278) (out=589) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2_inference_tags.yaml (in=1332) (out=603) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2.yaml (in=1281) (out=590) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_20_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_category_bert.yaml (in=1539) (out=665) (deflated 57%) adding: aux_data/Jacob_config/Tagger/ablation/ (in=0) (out=0) (stored 0%) adding: 
aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_caption+vinvl.yaml (in=1416) (out=634) (deflated 55%) adding: aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_oscar.yaml (in=1469) (out=662) (deflated 55%) adding: aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_caption+vinvl_all-tokens.yaml (in=1438) (out=637) (deflated 56%) adding: aux_data/Jacob_config/Tagger/ablation/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_caption_only.yaml (in=1423) (out=636) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_5e-5_BS_512.yaml (in=1244) (out=576) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_rank_inference_tags.yaml (in=1329) (out=603) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.9.yaml (in=1355) (out=610) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_224_10_epoch_lr_0.1_BS_512_SGD_sigmoid.yaml (in=1274) (out=600) (deflated 53%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2_inference_tags.yaml (in=1357) (out=615) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxViLT9M_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_1024_loss_focal_0.5_2_crop_0.08.yaml (in=1281) (out=590) (deflated 54%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxCCSBUCocoVGCap_B_Vilt_ViT_16_384_40_epoch_lr_5e-5_BS_1024_loss_focal_crop_0.08_inference.yaml (in=1352) (out=608) (deflated 55%) adding: aux_data/Jacob_config/Tagger/Jacob_Tagger_TaxOpenImagesV6_B_Vilt_ViT_16_384_10_epoch_lr_5e-5_BS_512_inference_tags.yaml (in=1272) (out=597) (deflated 53%) adding: aux_data/Jacob_config/Debug/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/Debug/vlp_uni_pipeline_distill_debub_vinvl.yaml (in=2004) (out=819) (deflated 59%) adding: aux_data/Jacob_config/Debug/vqa_uni_pipeline_debug_test.yaml (in=1505) (out=638) (deflated 58%) adding: aux_data/Jacob_config/Debug/checkpoint_uni_pipeline_debug.yaml (in=1195) (out=558) (deflated 53%) adding: aux_data/Jacob_config/Debug/vqa_uni_pipeline_debug.yaml (in=1450) (out=594) (deflated 59%) adding: aux_data/Jacob_config/Debug/distill_caption_uni_pipeline_debug.yaml (in=6083) (out=1858) (deflated 69%) adding: aux_data/Jacob_config/Debug/distill_caption_uni_pipeline_debug_multi_tower.yaml (in=2371) (out=848) (deflated 64%) adding: aux_data/Jacob_config/Debug/vqa_uni_pipeline_distill.yaml (in=1786) (out=812) (deflated 55%) adding: aux_data/Jacob_config/Debug/kim_vqa_uni_pipeline_distill_debug.yaml (in=1579) (out=734) (deflated 54%) adding: aux_data/Jacob_config/Debug/VinVL_Taxcococaption_bid_finetune.yaml (in=1208) (out=544) (deflated 55%) adding: aux_data/Jacob_config/Debug/kim_captioning.yaml (in=1516) (out=635) (deflated 58%) adding: aux_data/Jacob_config/Debug/nocaps_vilt_test.yaml (in=1247) (out=597) (deflated 52%) adding: aux_data/Jacob_config/Debug/others/ (in=0) (out=0) (stored 0%) adding: aux_data/Jacob_config/Debug/others/VLP.yaml (in=658) (out=366) (deflated 44%) adding: aux_data/Jacob_config/Debug/others/Retrieval.yaml 
(in=702) (out=390) (deflated 44%) adding: aux_data/Jacob_config/Debug/others/Caption.yaml (in=1086) (out=505) (deflated 53%) adding: aux_data/Jacob_config/Debug/others/Caption_test.yaml (in=719) (out=378) (deflated 47%) adding: aux_data/Jacob_config/Debug/kim_vqa_uni_pipeline_debug.yaml (in=1264) (out=574) (deflated 55%) adding: aux_data/Jacob_config/Debug/kim_vqa_uni_pipeline_debug_test.yaml (in=1260) (out=572) (deflated 55%) adding: aux_data/Jacob_config/Debug/tagger_uni_pipeline_debug.yaml (in=1944) (out=845) (deflated 57%) adding: aux_data/Jacob_config/Debug/vlp_uni_pipeline_distill_debug.yaml (in=2320) (out=858) (deflated 63%) adding: aux_data/Jacob_config/Debug/nosample_inference.yaml (in=1451) (out=621) (deflated 57%) adding: aux_data/Jacob_config/Debug/vinvl_caption_uni_pipeline_debug.yaml (in=1119) (out=507) (deflated 55%) adding: aux_data/Jacob_config/Debug/caption_uni_pipeline_debug.yaml (in=2538) (out=929) (deflated 63%) adding: aux_data/Jacob_config/Debug/caption_uni_pipeline_teacher.yaml (in=1561) (out=688) (deflated 56%) adding: aux_data/Jacob_config/Debug/vlp_uni_pipeline_debug.yaml (in=1507) (out=635) (deflated 58%) adding: aux_data/aml/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/Vision_GPU/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/Vision_GPU/config.json (in=137) (out=120) (deflated 12%) adding: aux_data/aml/Vision_GPU/aml.yaml (in=926) (out=477) (deflated 48%) adding: aux_data/aml/Vision_GPU/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/128V100X8/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/128V100X8/config.json (in=138) (out=121) (deflated 12%) adding: aux_data/aml/128V100X8/aml.yaml (in=926) (out=483) (deflated 48%) adding: aux_data/aml/128V100X8/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/config.json (in=134) (out=118) (deflated 12%) adding: aux_data/aml/docker/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/Vision_GPU/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/Vision_GPU/config.json (in=137) (out=120) (deflated 12%) adding: aux_data/aml/docker/Vision_GPU/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/docker/pytorch1.6/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/pytorch1.6/environment.json (in=201) (out=133) (deflated 34%) adding: aux_data/aml/docker/pytorch1.4/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/docker/pytorch1.4/environment.json (in=257) (out=152) (deflated 41%) adding: aux_data/aml/we3v32_eastus/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/we3v32_eastus/config.json (in=158) (out=121) (deflated 23%) adding: aux_data/aml/we3v32_eastus/aml.yaml (in=1026) (out=548) (deflated 47%) adding: aux_data/aml/we3v32_eastus/compute_target.json (in=42) (out=42) (stored 0%) adding: aux_data/aml/aml.yaml (in=926) (out=478) (deflated 48%) adding: aux_data/aml/others/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/others/aml_test.yaml (in=2406) (out=741) (deflated 69%) adding: aux_data/aml/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/VLP32GB/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/VLP32GB/config.json (in=134) (out=118) (deflated 12%) adding: aux_data/aml/VLP32GB/aml.yaml (in=926) (out=478) (deflated 48%) adding: aux_data/aml/VLP32GB/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/datablobs/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/datablobs/vigeast/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/datablobs/vigeast/datastore.json (in=360) (out=279) (deflated 23%) adding: 
aux_data/aml/datablobs/vig/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/datablobs/vig/datastore.json (in=352) (out=267) (deflated 24%) adding: aux_data/aml/CustVisP100/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/CustVisP100/config.json (in=138) (out=121) (deflated 12%) adding: aux_data/aml/CustVisP100/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/aml/cluster_base.yaml (in=1868) (out=706) (deflated 62%) adding: aux_data/aml/we3v32/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/we3v32/config.json (in=159) (out=122) (deflated 23%) adding: aux_data/aml/we3v32/aml.yaml (in=1020) (out=546) (deflated 46%) adding: aux_data/aml/we3v32/compute_target.json (in=43) (out=43) (stored 0%) adding: aux_data/aml/CustVis32GB/ (in=0) (out=0) (stored 0%) adding: aux_data/aml/CustVis32GB/config.json (in=138) (out=121) (deflated 12%) adding: aux_data/aml/CustVis32GB/aml.yaml (in=926) (out=478) (deflated 48%) adding: aux_data/aml/CustVis32GB/compute_target.json (in=37) (out=37) (stored 0%) adding: aux_data/untrained_config/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/config.json (in=570) (out=287) (deflated 50%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/VILT-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_224/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/config.json (in=570) (out=288) (deflated 49%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/VILT-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_16_384/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/special_tokens_map.json (in=112) (out=67) (deflated 40%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/config.json (in=313) (out=167) (deflated 47%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/minilm-l12-h384-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/MiniLM-L12-H384-uncased-rand/added_tokens.json (in=2) (out=2) (stored 0%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/config.json (in=340) (out=180) (deflated 47%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/Oscar-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/Oscar-L12-H784-uncased/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/ (in=0) (out=0) (stored 0%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/config.json (in=570) (out=288) (deflated 49%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/VILT-L12-H784-uncased-vocab-nlg.txt (in=231478) (out=109832) (deflated 53%) adding: aux_data/untrained_config/VILT-L12-H784-uncased_32_384/vocab.txt (in=231508) (out=109776) (deflated 53%) adding: 
compile.aml.sh (in=966) (out=329) (deflated 66%) adding: dog.jpg (in=145305) (out=144651) (deflated 0%) adding: entry.py (in=227) (out=127) (deflated 44%) adding: flops.pdf (in=28598) (out=21980) (deflated 23%) adding: images/ (in=0) (out=0) (stored 0%) adding: mask_output (in=116) (out=116) (stored 0%) adding: models (in=121) (out=121) (stored 0%) adding: requirements.txt (in=627) (out=386) (deflated 38%) adding: scripts/ (in=0) (out=0) (stored 0%) adding: scripts/irisextract.py (in=2337) (out=940) (deflated 60%) adding: scripts/mergebn2.py (in=4335) (out=1417) (deflated 67%) adding: scripts/model_initialization.py (in=5278) (out=1690) (deflated 68%) adding: scripts/qd_pytorch.py (in=0) (out=0) (stored 0%) adding: scripts/model_inference.py (in=4962) (out=1837) (deflated 63%) adding: scripts/torch_from_imagenet.py (in=11565) (out=3238) (deflated 72%) adding: scripts/taxonomy.py (in=29618) (out=7273) (deflated 75%) adding: scripts/trainrpn.py (in=6746) (out=2443) (deflated 64%) adding: scripts/gen_rpnprototxt.py (in=8567) (out=1988) (deflated 77%) adding: scripts/qd_const.py (in=108) (out=63) (deflated 42%) adding: scripts/wt_stats.py (in=3271) (out=1299) (deflated 60%) adding: scripts/share.py (in=1434) (out=512) (deflated 64%) adding: scripts/cocoeval.py (in=4618) (out=1454) (deflated 69%) adding: scripts/qd_maskrcnn.py (in=36431) (out=8161) (deflated 78%) adding: scripts/prepare_voc.py (in=6890) (out=2109) (deflated 69%) adding: scripts/convert_to_tsv.py (in=12824) (out=3892) (deflated 70%) adding: scripts/eval.py (in=3897) (out=1318) (deflated 66%) adding: scripts/process_image.py (in=5976) (out=1793) (deflated 70%) adding: scripts/ssddet.py (in=8743) (out=3129) (deflated 64%) adding: scripts/torch_transfer_learning.py (in=4888) (out=1611) (deflated 67%) adding: scripts/a.py (in=1309) (out=495) (deflated 62%) adding: scripts/process_tsv.py (in=180050) (out=33296) (deflated 82%) adding: scripts/_init_paths.py (in=479) (out=269) (deflated 44%) adding: scripts/backup.py (in=1283) (out=519) (deflated 60%) adding: scripts/runt.py (in=54347) (out=12867) (deflated 76%) adding: scripts/print_result.py (in=3089) (out=1058) (deflated 66%) adding: scripts/__init__.py (in=0) (out=0) (stored 0%) adding: scripts/tsv_io.py (in=28353) (out=6035) (deflated 79%) adding: scripts/qd_lstm.py (in=8710) (out=3402) (deflated 61%) adding: scripts/yolodet.py (in=29313) (out=7050) (deflated 76%) adding: scripts/email_util.py (in=563) (out=273) (deflated 52%) adding: scripts/setup_pyfrcn.py (in=1601) (out=737) (deflated 54%) adding: scripts/remote_run.py (in=7891) (out=2104) (deflated 73%) adding: scripts/tsvdet.py (in=9866) (out=2788) (deflated 72%) adding: scripts/test_unit.py (in=8306) (out=1729) (deflated 79%) adding: scripts/setup_caffe.py (in=1467) (out=645) (deflated 56%) adding: scripts/lineidx.py (in=581) (out=293) (deflated 50%) adding: scripts/yoloinit.py (in=24993) (out=4740) (deflated 81%) adding: scripts/deteval.py (in=39) (out=39) (stored 0%) adding: scripts/synsetizer.py (in=3706) (out=1266) (deflated 66%) adding: scripts/pytablemd.py (in=3492) (out=1236) (deflated 65%) adding: scripts/train.py (in=7527) (out=2663) (deflated 65%) adding: scripts/vis_bkg.py (in=3103) (out=1019) (deflated 67%) adding: scripts/roiextract.py (in=8593) (out=3069) (deflated 64%) adding: scripts/mergebn.py (in=4785) (out=1448) (deflated 70%) adding: scripts/tools.py (in=17660) (out=4831) (deflated 73%) adding: scripts/yoloeval.py (in=12288) (out=3866) (deflated 69%) adding: scripts/exps.py (in=64) (out=61) (deflated 5%) 
adding: scripts/latex_writer.py (in=34) (out=34) (stored 0%)
adding: scripts/hdf5datalayer.py (in=1641) (out=669) (deflated 59%)
adding: scripts/create_mnist.py (in=2610) (out=915) (deflated 65%)
adding: scripts/gen_prototxt.py (in=3928) (out=1283) (deflated 67%)
adding: scripts/msoftmax.py (in=40877) (out=6906) (deflated 83%)
adding: scripts/q_gen_csv.py (in=9776) (out=2599) (deflated 73%)
adding: scripts/iristrain.py (in=9198) (out=2972) (deflated 68%)
adding: scripts/process_dataset.py (in=11721) (out=2798) (deflated 76%)
adding: scripts/yolotree_init.py (in=18746) (out=4223) (deflated 77%)
adding: scripts/garbage_collector.py (in=2396) (out=854) (deflated 64%)
adding: scripts/drawresults.py (in=3502) (out=1271) (deflated 64%)
adding: scripts/deteval_voc.py (in=6279) (out=2111) (deflated 66%)
adding: scripts/wordtree.py (in=1744) (out=581) (deflated 67%)
adding: scripts/demo_detection.py (in=14418) (out=3690) (deflated 74%)
adding: scripts/qd_common.py (in=66902) (out=16658) (deflated 75%)
adding: scripts/templatenet.py (in=335) (out=221) (deflated 34%)
adding: scripts/qd_util.py (in=352217) (out=72775) (deflated 79%)
adding: scripts/rpneval.py (in=4845) (out=1699) (deflated 65%)
adding: src/ (in=0) (out=0) (stored 0%)
adding: src/linear_attention_transformer/ (in=0) (out=0) (stored 0%)
adding: src/linear_attention_transformer/autoregressive_wrapper.py (in=3575) (out=1242) (deflated 65%)
adding: src/linear_attention_transformer/__init__.py (in=339) (out=130) (deflated 62%)
adding: src/linear_attention_transformer/linear_attention_transformer.py (in=19083) (out=4720) (deflated 75%)
adding: src/linear_attention_transformer/autopadder.py (in=2102) (out=741) (deflated 65%)
adding: src/linear_attention_transformer/reversible.py (in=6104) (out=1840) (deflated 70%)
adding: src/linear_attention_transformer/images.py (in=1842) (out=621) (deflated 66%)
adding: src/qd/ (in=0) (out=0) (stored 0%)
adding: src/qd/evaluate/ (in=0) (out=0) (stored 0%)
adding: src/qd/evaluate/evaluate_openimages_google.py (in=27126) (out=6260) (deflated 77%)
adding: src/qd/evaluate/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/evaluate/oid_hierarchical_labels_expansion_tsv.py (in=8680) (out=2503) (deflated 71%)
adding: src/qd/qd_pytorch.py (in=131938) (out=28629) (deflated 78%)
adding: src/qd/examples.py (in=679) (out=286) (deflated 58%)
adding: src/qd/taxonomy.py (in=29726) (out=7349) (deflated 75%)
adding: src/qd/unittest/ (in=0) (out=0) (stored 0%)
adding: src/qd/unittest/test_qd_common.py (in=6192) (out=1629) (deflated 74%)
adding: src/qd/unittest/test_philly.py (in=554) (out=240) (deflated 57%)
adding: src/qd/unittest/test_masktsvdataset.py (in=3254) (out=933) (deflated 71%)
adding: src/qd/unittest/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/unittest/test_maskrcnn.py (in=0) (out=0) (stored 0%)
adding: src/qd/unittest/test_tsvdatasetdb.py (in=1754) (out=443) (deflated 75%)
adding: src/qd/unittest/test_pytorch.py (in=5194) (out=1491) (deflated 71%)
adding: src/qd/unittest/test_process_tsv.py (in=1272) (out=483) (deflated 62%)
adding: src/qd/unittest/test_cloud_storage.py (in=708) (out=232) (deflated 67%)
adding: src/qd/unittest/test_layers.py (in=1527) (out=566) (deflated 63%)
adding: src/qd/unittest/test_mmtsvdataset.py (in=2574) (out=882) (deflated 66%)
adding: src/qd/philly.py (in=300) (out=201) (deflated 33%)
adding: src/qd/pipeline.py (in=54481) (out=12104) (deflated 78%)
adding: src/qd/cocoeval.py (in=4757) (out=1490) (deflated 69%)
adding: src/qd/qd_maskrcnn.py (in=49926) (out=12035) (deflated 76%)
adding: src/qd/qd_yolov2pt.py (in=5183) (out=1516) (deflated 71%)
adding: src/qd/acc_query.py (in=2471) (out=861) (deflated 65%)
adding: src/qd/batch_process.py (in=9892) (out=2278) (deflated 77%)
adding: src/qd/image_text_align.py (in=12435) (out=2801) (deflated 77%)
adding: src/qd/torch_common.py (in=38968) (out=9661) (deflated 75%)
adding: src/qd/process_image.py (in=10643) (out=3052) (deflated 71%)
adding: src/qd/mask/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/tsv_file.py (in=2484) (out=892) (deflated 64%)
adding: src/qd/mask/structures/segmentation_mask.py (in=17446) (out=3939) (deflated 77%)
adding: src/qd/mask/structures/image_list.py (in=2664) (out=992) (deflated 63%)
adding: src/qd/mask/structures/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/structures/boxlist_ops.py (in=6609) (out=1950) (deflated 70%)
adding: src/qd/mask/structures/keypoint.py (in=6555) (out=1795) (deflated 73%)
adding: src/qd/mask/structures/bounding_box.py (in=11698) (out=2887) (deflated 75%)
adding: src/qd/mask/layers/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/tokenization_bert.py (in=20945) (out=5422) (deflated 74%)
adding: src/qd/mask/layers/bert/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/file_utils.py (in=8876) (out=2861) (deflated 68%)
adding: src/qd/mask/layers/bert/modeling_outputs.py (in=36759) (out=2071) (deflated 94%)
adding: src/qd/mask/layers/bert/modeling_utils.py (in=78488) (out=17476) (deflated 78%)
adding: src/qd/mask/layers/bert/__init__.py (in=1436) (out=488) (deflated 66%)
adding: src/qd/mask/layers/bert/modeling_bert.py (in=400457) (out=36841) (deflated 91%)
adding: src/qd/mask/layers/bert/others/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/bert/others/modeling_bert.py (in=458281) (out=36277) (deflated 92%)
adding: src/qd/mask/layers/bert/activations.py (in=1723) (out=707) (deflated 59%)
adding: src/qd/mask/layers/bert/modeling_mobilebert.py (in=69244) (out=12979) (deflated 81%)
adding: src/qd/mask/layers/bert/tokenization_utils.py (in=21245) (out=5003) (deflated 76%)
adding: src/qd/mask/layers/others_bert/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/tokenization_bert.py (in=19521) (out=5148) (deflated 74%)
adding: src/qd/mask/layers/others_bert/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/file_utils.py (in=8876) (out=2861) (deflated 68%)
adding: src/qd/mask/layers/others_bert/modeling_vilt.py (in=51587) (out=11246) (deflated 78%)
adding: src/qd/mask/layers/others_bert/modeling_outputs.py (in=36759) (out=2071) (deflated 94%)
adding: src/qd/mask/layers/others_bert/modeling_utils.py (in=77361) (out=17172) (deflated 78%)
adding: src/qd/mask/layers/others_bert/__init__.py (in=494) (out=263) (deflated 47%)
adding: src/qd/mask/layers/others_bert/modeling_bert.py (in=226036) (out=24196) (deflated 89%)
adding: src/qd/mask/layers/others_bert/fig/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/architecture_v2.pdf (in=143473) (out=130471) (deflated 9%)
adding: src/qd/mask/layers/others_bert/fig/acc.eps (in=35077) (out=11654) (deflated 67%)
adding: src/qd/mask/layers/others_bert/fig/params.pdf (in=15495) (out=11682) (deflated 25%)
adding: src/qd/mask/layers/others_bert/fig/flops.pdf (in=18494) (out=14218) (deflated 23%)
adding: src/qd/mask/layers/others_bert/fig/docProps/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/docProps/core.xml (in=691) (out=353) (deflated 49%)
adding: src/qd/mask/layers/others_bert/fig/docProps/thumbnail.jpeg (in=9569) (out=9019) (deflated 6%)
adding: src/qd/mask/layers/others_bert/fig/docProps/app.xml (in=1353) (out=532) (deflated 61%)
adding: src/qd/mask/layers/others_bert/fig/flops.eps (in=30629) (out=11224) (deflated 63%)
adding: src/qd/mask/layers/others_bert/fig/params.eps (in=26204) (out=9257) (deflated 65%)
adding: src/qd/mask/layers/others_bert/fig/vqa.pdf (in=17354) (out=13292) (deflated 23%)
adding: src/qd/mask/layers/others_bert/fig/architecture_v3.pdf (in=145597) (out=130418) (deflated 10%)
adding: src/qd/mask/layers/others_bert/fig/docMetadata/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/docMetadata/LabelInfo.xml (in=323) (out=238) (deflated 26%)
adding: src/qd/mask/layers/others_bert/fig/architecture1.pdf (in=117185) (out=104869) (deflated 11%)
adding: src/qd/mask/layers/others_bert/fig/cider.pdf (in=14564) (out=10997) (deflated 24%)
adding: src/qd/mask/layers/others_bert/fig/bert.pdf (in=19385) (out=14670) (deflated 24%)
adding: src/qd/mask/layers/others_bert/fig/vqa.eps (in=27463) (out=10284) (deflated 63%)
adding: src/qd/mask/layers/others_bert/fig/vary_detector.eps (in=40582) (out=12895) (deflated 68%)
adding: src/qd/mask/layers/others_bert/fig/arch-crop.pdf (in=139013) (out=134787) (deflated 3%)
adding: src/qd/mask/layers/others_bert/fig/vary_detector.pdf (in=22581) (out=17984) (deflated 20%)
adding: src/qd/mask/layers/others_bert/fig/cider.eps (in=22108) (out=8379) (deflated 62%)
adding: src/qd/mask/layers/others_bert/fig/arch.pdf (in=151742) (out=139606) (deflated 8%)
adding: src/qd/mask/layers/others_bert/fig/param.pdf (in=15572) (out=12009) (deflated 23%)
adding: src/qd/mask/layers/others_bert/fig/ppt/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/changesInfos/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/changesInfos/changesInfo1.xml (in=15571) (out=2242) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/slide1.xml (in=99995) (out=8894) (deflated 91%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slides/_rels/slide1.xml.rels (in=1665) (out=274) (deflated 84%)
adding: src/qd/mask/layers/others_bert/fig/ppt/presentation.xml (in=3212) (out=544) (deflated 83%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout11.xml (in=4200) (out=1168) (deflated 72%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout3.xml (in=5442) (out=1302) (deflated 76%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout7.xml (in=2550) (out=892) (deflated 65%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout4.xml (in=4975) (out=1184) (deflated 76%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout1.xml (in=4678) (out=1250) (deflated 73%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout6.xml (in=3064) (out=969) (deflated 68%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout2.xml (in=3921) (out=1080) (deflated 72%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout10.xml (in=3976) (out=1114) (deflated 72%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout8.xml (in=5952) (out=1421) (deflated 76%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout5.xml (in=7938) (out=1512) (deflated 81%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/slideLayout9.xml (in=5899) (out=1367) (deflated 77%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout8.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout6.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout3.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout11.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout5.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout10.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout2.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout1.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout4.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout7.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideLayouts/_rels/slideLayout9.xml.rels (in=311) (out=182) (deflated 41%)
adding: src/qd/mask/layers/others_bert/fig/ppt/tableStyles.xml (in=182) (out=165) (deflated 9%)
adding: src/qd/mask/layers/others_bert/fig/ppt/presProps.xml (in=964) (out=443) (deflated 54%)
adding: src/qd/mask/layers/others_bert/fig/ppt/theme/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/theme/theme1.xml (in=8399) (out=1692) (deflated 80%)
adding: src/qd/mask/layers/others_bert/fig/ppt/viewProps.xml (in=812) (out=382) (deflated 53%)
adding: src/qd/mask/layers/others_bert/fig/ppt/revisionInfo.xml (in=429) (out=270) (deflated 37%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image2.png (in=792) (out=648) (deflated 18%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image5.png (in=549) (out=464) (deflated 15%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image1.jpeg (in=127146) (out=126948) (deflated 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image9.png (in=721) (out=634) (deflated 12%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image4.png (in=114842) (out=114862) (deflated 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image6.png (in=596) (out=514) (deflated 14%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image7.png (in=594) (out=507) (deflated 15%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image3.png (in=78460) (out=78475) (deflated 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image10.png (in=771) (out=688) (deflated 11%)
adding: src/qd/mask/layers/others_bert/fig/ppt/media/image8.png (in=643) (out=558) (deflated 13%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/slideMaster1.xml (in=13876) (out=2008) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/slideMasters/_rels/slideMaster1.xml.rels (in=1991) (out=271) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/ppt/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/ppt/_rels/presentation.xml.rels (in=1246) (out=321) (deflated 74%)
adding: src/qd/mask/layers/others_bert/fig/acc.pdf (in=19350) (out=14583) (deflated 25%)
adding: src/qd/mask/layers/others_bert/fig/_Content_Types_.xml (in=3528) (out=499) (deflated 86%)
adding: src/qd/mask/layers/others_bert/fig/_rels/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/others_bert/fig/bert.eps (in=35521) (out=11205) (deflated 68%)
adding: src/qd/mask/layers/others_bert/activations.py (in=1723) (out=707) (deflated 59%)
adding: src/qd/mask/layers/others_bert/nce_modeling_bert.py (in=97757) (out=17827) (deflated 82%)
adding: src/qd/mask/layers/others_bert/modeling_mobilebert.py (in=69244) (out=12979) (deflated 81%)
adding: src/qd/mask/layers/others_bert/tokenization_utils.py (in=20106) (out=4825) (deflated 76%)
adding: src/qd/mask/layers/clip/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/clip/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/clip/__init__.py (in=1) (out=1) (stored 0%)
adding: src/qd/mask/layers/clip/simple_tokenizer.py (in=4632) (out=1724) (deflated 63%)
adding: src/qd/mask/layers/clip/model.py (in=27045) (out=5540) (deflated 80%)
adding: src/qd/mask/layers/clip/clip.py (in=8363) (out=3075) (deflated 63%)
adding: src/qd/mask/layers/scale.py (in=270) (out=164) (deflated 39%)
adding: src/qd/mask/layers/dcn/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/dcn/deform_conv_func.py (in=8496) (out=1665) (deflated 80%)
adding: src/qd/mask/layers/dcn/deform_pool_func.py (in=2648) (out=695) (deflated 74%)
adding: src/qd/mask/layers/dcn/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/dcn/__init__.py (in=102) (out=88) (deflated 14%)
adding: src/qd/mask/layers/dcn/deform_conv_module.py (in=6076) (out=1163) (deflated 81%)
adding: src/qd/mask/layers/dcn/deform_pool_module.py (in=6307) (out=784) (deflated 88%)
adding: src/qd/mask/layers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/__init__.py (in=1442) (out=445) (deflated 69%)
adding: src/qd/mask/layers/_utils.py (in=1165) (out=470) (deflated 60%)
adding: src/qd/mask/layers/batch_norm.py (in=1091) (out=438) (deflated 60%)
adding: src/qd/mask/layers/vilt/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/transforms/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/transforms/utils.py (in=1645) (out=666) (deflated 60%)
adding: src/qd/mask/layers/vilt/transforms/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/transforms/__init__.py (in=301) (out=148) (deflated 51%)
adding: src/qd/mask/layers/vilt/transforms/randaug.py (in=7025) (out=1962) (deflated 72%)
adding: src/qd/mask/layers/vilt/transforms/pixelbert.py (in=765) (out=279) (deflated 64%)
adding: src/qd/mask/layers/vilt/config.py (in=6216) (out=1343) (deflated 78%)
adding: src/qd/mask/layers/vilt/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datamodules/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datamodules/datamodule_base.py (in=5637) (out=1072) (deflated 81%)
adding: src/qd/mask/layers/vilt/datamodules/sbu_datamodule.py (in=375) (out=184) (deflated 51%)
adding: src/qd/mask/layers/vilt/datamodules/vqav2_datamodule.py (in=1413) (out=488) (deflated 65%)
adding: src/qd/mask/layers/vilt/datamodules/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datamodules/multitask_datamodule.py (in=2712) (out=662) (deflated 76%)
adding: src/qd/mask/layers/vilt/datamodules/__init__.py (in=707) (out=221) (deflated 69%)
adding: src/qd/mask/layers/vilt/datamodules/nlvr2_datamodule.py (in=362) (out=184) (deflated 49%)
adding: src/qd/mask/layers/vilt/datamodules/coco_caption_karpathy_datamodule.py (in=496) (out=200) (deflated 60%)
adding: src/qd/mask/layers/vilt/datamodules/conceptual_caption_datamodule.py (in=396) (out=185) (deflated 53%)
adding: src/qd/mask/layers/vilt/datamodules/f30k_caption_karpathy_datamodule.py (in=496) (out=204) (deflated 59%)
adding: src/qd/mask/layers/vilt/datamodules/vg_caption_datamodule.py (in=401) (out=189) (deflated 53%)
adding: src/qd/mask/layers/vilt/datasets/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datasets/vg_caption_dataset.py (in=506) (out=246) (deflated 51%)
adding: src/qd/mask/layers/vilt/datasets/f30k_caption_karpathy_dataset.py (in=615) (out=266) (deflated 57%)
adding: src/qd/mask/layers/vilt/datasets/sbu_caption_dataset.py (in=543) (out=267) (deflated 51%)
adding: src/qd/mask/layers/vilt/datasets/vqav2_dataset.py (in=1480) (out=481) (deflated 68%)
adding: src/qd/mask/layers/vilt/datasets/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/datasets/conceptual_caption_dataset.py (in=598) (out=284) (deflated 53%)
adding: src/qd/mask/layers/vilt/datasets/__init__.py (in=395) (out=148) (deflated 63%)
adding: src/qd/mask/layers/vilt/datasets/nlvr2_dataset.py (in=1610) (out=569) (deflated 65%)
adding: src/qd/mask/layers/vilt/datasets/base_dataset.py (in=8904) (out=2283) (deflated 74%)
adding: src/qd/mask/layers/vilt/datasets/coco_caption_karpathy_dataset.py (in=970) (out=389) (deflated 60%)
adding: src/qd/mask/layers/vilt/modules/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/modules/heads.py (in=1569) (out=443) (deflated 72%)
adding: src/qd/mask/layers/vilt/modules/vision_transformer.py (in=49230) (out=8865) (deflated 82%)
adding: src/qd/mask/layers/vilt/modules/vilt_module.py (in=4869) (out=1406) (deflated 71%)
adding: src/qd/mask/layers/vilt/modules/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/modules/__init__.py (in=73) (out=66) (deflated 10%)
adding: src/qd/mask/layers/vilt/modules/dist_utils.py (in=7814) (out=2322) (deflated 70%)
adding: src/qd/mask/layers/vilt/modules/vilt_utils.py (in=10912) (out=1854) (deflated 83%)
adding: src/qd/mask/layers/vilt/modules/objectives.py (in=22049) (out=5015) (deflated 77%)
adding: src/qd/mask/layers/vilt/gadgets/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/gadgets/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/gadgets/my_metrics.py (in=2359) (out=573) (deflated 76%)
adding: src/qd/mask/layers/vilt/gadgets/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/utils/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/layers/vilt/utils/write_coco_karpathy.py (in=1904) (out=745) (deflated 61%)
adding: src/qd/mask/layers/vilt/utils/write_f30k_karpathy.py (in=1871) (out=736) (deflated 61%)
adding: src/qd/mask/layers/vilt/utils/glossary.py (in=4435) (out=1230) (deflated 72%)
adding: src/qd/mask/layers/vilt/utils/write_vqa.py (in=6523) (out=1678) (deflated 74%)
adding: src/qd/mask/layers/vilt/utils/write_sbu.py (in=1785) (out=712) (deflated 60%)
adding: src/qd/mask/layers/vilt/utils/write_conceptual_caption.py (in=2037) (out=761) (deflated 63%)
adding: src/qd/mask/layers/vilt/utils/write_nlvr2.py (in=2818) (out=851) (deflated 70%)
adding: src/qd/mask/layers/vilt/utils/write_vg.py (in=1928) (out=754) (deflated 61%)
adding: src/qd/mask/layers/misc.py (in=6021) (out=1660) (deflated 72%)
adding: src/qd/mask/layers/nms.py (in=618) (out=317) (deflated 49%)
adding: src/qd/mask/layers/roi_align.py (in=2142) (out=643) (deflated 70%)
adding: src/qd/mask/layers/iou_loss.py (in=1961) (out=649) (deflated 67%)
adding: src/qd/mask/layers/sigmoid_focal_loss.py (in=2374) (out=776) (deflated 67%)
adding: src/qd/mask/layers/roi_pool.py (in=1887) (out=609) (deflated 68%)
adding: src/qd/mask/layers/smooth_l1_loss.py (in=481) (out=293) (deflated 39%)
adding: src/qd/mask/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/samplers/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/samplers/iteration_based_batch_sampler.py (in=1164) (out=456) (deflated 61%)
adding: src/qd/mask/data/samplers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/samplers/__init__.py (in=328) (out=170) (deflated 48%)
adding: src/qd/mask/data/samplers/grouped_batch_sampler.py (in=4845) (out=1645) (deflated 66%)
adding: src/qd/mask/data/samplers/distributed.py (in=3754) (out=1327) (deflated 65%)
adding: src/qd/mask/data/transforms/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/transforms/build.py (in=5207) (out=1163) (deflated 78%)
adding: src/qd/mask/data/transforms/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/transforms/__init__.py (in=284) (out=149) (deflated 48%)
adding: src/qd/mask/data/transforms/transforms.py (in=3085) (out=888) (deflated 71%)
adding: src/qd/mask/data/build.py (in=7324) (out=2393) (deflated 67%)
adding: src/qd/mask/data/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/__init__.py (in=108) (out=100) (deflated 7%)
adding: src/qd/mask/data/collate_batch.py (in=1080) (out=431) (deflated 60%)
adding: src/qd/mask/data/datasets/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/concat_dataset.py (in=766) (out=323) (deflated 58%)
adding: src/qd/mask/data/datasets/list_dataset.py (in=936) (out=444) (deflated 53%)
adding: src/qd/mask/data/datasets/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/__init__.py (in=458) (out=213) (deflated 53%)
adding: src/qd/mask/data/datasets/masktsvdataset.py (in=13707) (out=2874) (deflated 79%)
adding: src/qd/mask/data/datasets/caption_tsv.py (in=21070) (out=3933) (deflated 81%)
adding: src/qd/mask/data/datasets/caption_tensorizer.py (in=65072) (out=6209) (deflated 90%)
adding: src/qd/mask/data/datasets/coco.py (in=3616) (out=1265) (deflated 65%)
adding: src/qd/mask/data/datasets/evaluation/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/evaluation/voc/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/evaluation/voc/__init__.py (in=505) (out=240) (deflated 52%)
adding: src/qd/mask/data/datasets/evaluation/voc/voc_eval.py (in=8085) (out=2452) (deflated 70%)
adding: src/qd/mask/data/datasets/evaluation/__init__.py (in=994) (out=429) (deflated 57%)
adding: src/qd/mask/data/datasets/evaluation/coco/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/evaluation/coco/__init__.py (in=494) (out=171) (deflated 65%)
adding: src/qd/mask/data/datasets/evaluation/coco/coco_eval.py (in=13715) (out=3680) (deflated 73%)
adding: src/qd/mask/data/datasets/utils/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/utils/utils_glue.py (in=37206) (out=5527) (deflated 85%)
adding: src/qd/mask/data/datasets/utils/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/data/datasets/utils/config_args.py (in=3111) (out=1010) (deflated 68%)
adding: src/qd/mask/data/datasets/utils/image_ops.py (in=327) (out=192) (deflated 41%)
adding: src/qd/mask/data/datasets/utils/box_label_loader.py (in=4775) (out=1358) (deflated 72%)
adding: src/qd/mask/data/datasets/utils/load_files.py (in=2290) (out=653) (deflated 71%)
adding: src/qd/mask/data/datasets/voc.py (in=4114) (out=1407) (deflated 66%)
adding: src/qd/mask/data/README.md (in=2763) (out=1014) (deflated 63%)
adding: src/qd/mask/solver/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/solver/lr_scheduler.py (in=3423) (out=857) (deflated 75%)
adding: src/qd/mask/solver/build.py (in=3373) (out=1088) (deflated 68%)
adding: src/qd/mask/solver/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/solver/__init__.py (in=476) (out=210) (deflated 56%)
adding: src/qd/mask/solver/LARC.py (in=3976) (out=1277) (deflated 68%)
adding: src/qd/mask/solver/optimization.py (in=9530) (out=2563) (deflated 73%)
adding: src/qd/mask/config/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/config/defaults.py (in=21957) (out=6419) (deflated 71%)
adding: src/qd/mask/config/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/config/paths_catalog.py (in=8606) (out=2026) (deflated 76%)
adding: src/qd/mask/config/__init__.py (in=139) (out=105) (deflated 24%)
adding: src/qd/mask/utils/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/transforms/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/transforms/build.py (in=5207) (out=1163) (deflated 78%)
adding: src/qd/mask/utils/transforms/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/transforms/__init__.py (in=284) (out=149) (deflated 48%)
adding: src/qd/mask/utils/transforms/transforms.py (in=3085) (out=888) (deflated 71%)
adding: src/qd/mask/utils/model_serialization.py (in=4024) (out=1492) (deflated 63%)
adding: src/qd/mask/utils/timer.py (in=1127) (out=414) (deflated 63%)
adding: src/qd/mask/utils/checkpoint.py (in=5536) (out=1552) (deflated 72%)
adding: src/qd/mask/utils/miscellaneous.py (in=228) (out=160) (deflated 30%)
adding: src/qd/mask/utils/build.py (in=7324) (out=2393) (deflated 67%)
adding: src/qd/mask/utils/metric_logger.py (in=2254) (out=787) (deflated 65%)
adding: src/qd/mask/utils/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/utils/collate_batch.py (in=1080) (out=431) (deflated 60%)
adding: src/qd/mask/utils/imports.py (in=843) (out=382) (deflated 55%)
adding: src/qd/mask/utils/logger.py (in=783) (out=354) (deflated 55%)
adding: src/qd/mask/utils/cv2_util.py (in=640) (out=289) (deflated 55%)
adding: src/qd/mask/utils/collect_env.py (in=338) (out=203) (deflated 40%)
adding: src/qd/mask/utils/c2_model_loading.py (in=8514) (out=2112) (deflated 75%)
adding: src/qd/mask/utils/model_zoo.py (in=3031) (out=1292) (deflated 57%)
adding: src/qd/mask/utils/comm.py (in=3804) (out=1302) (deflated 66%)
adding: src/qd/mask/utils/README.md (in=175) (out=119) (deflated 32%)
adding: src/qd/mask/utils/registry.py (in=1385) (out=537) (deflated 61%)
adding: src/qd/mask/utils/env.py (in=1249) (out=522) (deflated 58%)
adding: src/qd/mask/modeling/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/poolers.py (in=4821) (out=1707) (deflated 65%)
adding: src/qd/mask/modeling/matcher.py (in=5268) (out=1668) (deflated 68%)
adding: src/qd/mask/modeling/utils.py (in=400) (out=247) (deflated 38%)
adding: src/qd/mask/modeling/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/detector/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/detector/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/detector/__init__.py (in=117) (out=104) (deflated 11%)
adding: src/qd/mask/modeling/detector/generalized_rcnn.py (in=2766) (out=1036) (deflated 63%)
adding: src/qd/mask/modeling/detector/detectors.py (in=324) (out=207) (deflated 36%)
adding: src/qd/mask/modeling/box_coder.py (in=3367) (out=1011) (deflated 70%)
adding: src/qd/mask/modeling/make_layers.py (in=3768) (out=1195) (deflated 68%)
adding: src/qd/mask/modeling/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/backbone/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/backbone/mobilenet.py (in=4610) (out=1304) (deflated 72%)
adding: src/qd/mask/modeling/backbone/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/backbone/resnet.py (in=15687) (out=3361) (deflated 79%)
adding: src/qd/mask/modeling/backbone/fpn.py (in=3906) (out=1261) (deflated 68%)
adding: src/qd/mask/modeling/backbone/__init__.py (in=129) (out=105) (deflated 19%)
adding: src/qd/mask/modeling/backbone/fbnet.py (in=7824) (out=2129) (deflated 73%)
adding: src/qd/mask/modeling/backbone/fbnet_builder.py (in=24950) (out=4940) (deflated 80%)
adding: src/qd/mask/modeling/backbone/fbnet_modeldef.py (in=5985) (out=857) (deflated 86%)
adding: src/qd/mask/modeling/backbone/backbone.py (in=4442) (out=931) (deflated 79%)
adding: src/qd/mask/modeling/utils_caption_evaluate.py (in=14409) (out=4505) (deflated 69%)
adding: src/qd/mask/modeling/rpn/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/learnable_anchor_generator.py (in=6232) (out=2144) (deflated 66%)
adding: src/qd/mask/modeling/rpn/rpn.py (in=7958) (out=2104) (deflated 74%)
adding: src/qd/mask/modeling/rpn/utils.py (in=1679) (out=634) (deflated 62%)
adding: src/qd/mask/modeling/rpn/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/__init__.py (in=101) (out=96) (deflated 5%)
adding: src/qd/mask/modeling/rpn/loss.py (in=10613) (out=2871) (deflated 73%)
adding: src/qd/mask/modeling/rpn/anchor_generator.py (in=10241) (out=3158) (deflated 69%)
adding: src/qd/mask/modeling/rpn/retinanet/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/retinanet/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/retinanet/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/retinanet/loss.py (in=3435) (out=1046) (deflated 70%)
adding: src/qd/mask/modeling/rpn/retinanet/inference.py (in=6881) (out=1978) (deflated 71%)
adding: src/qd/mask/modeling/rpn/retinanet/retinanet.py (in=5293) (out=1554) (deflated 71%)
adding: src/qd/mask/modeling/rpn/inference.py (in=10497) (out=2689) (deflated 74%)
adding: src/qd/mask/modeling/rpn/fcos/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/fcos/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/fcos/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/rpn/fcos/loss.py (in=11284) (out=2989) (deflated 74%)
adding: src/qd/mask/modeling/rpn/fcos/fcos.py (in=7492) (out=1982) (deflated 74%)
adding: src/qd/mask/modeling/rpn/fcos/inference.py (in=6826) (out=1886) (deflated 72%)
adding: src/qd/mask/modeling/balanced_positive_negative_sampler.py (in=4221) (out=1174) (deflated 72%)
adding: src/qd/mask/modeling/roi_heads/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/mask_head/loss.py (in=5339) (out=1749) (deflated 67%)
adding: src/qd/mask/modeling/roi_heads/mask_head/mask_head.py (in=3133) (out=1076) (deflated 66%)
adding: src/qd/mask/modeling/roi_heads/mask_head/roi_mask_feature_extractors.py (in=2481) (out=898) (deflated 64%)
adding: src/qd/mask/modeling/roi_heads/mask_head/roi_mask_predictors.py (in=2208) (out=673) (deflated 70%)
adding: src/qd/mask/modeling/roi_heads/mask_head/inference.py (in=6549) (out=2359) (deflated 64%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/roi_attribute_feature_extractors.py (in=734) (out=312) (deflated 57%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/loss.py (in=1987) (out=781) (deflated 61%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/inference.py (in=4210) (out=1517) (deflated 64%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/attribute_head.py (in=4510) (out=1440) (deflated 68%)
adding: src/qd/mask/modeling/roi_heads/attribute_head/roi_attribute_predictors.py (in=3279) (out=762) (deflated 77%)
adding: src/qd/mask/modeling/roi_heads/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/roi_keypoint_feature_extractors.py (in=1871) (out=714) (deflated 62%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/keypoint_head.py (in=2057) (out=658) (deflated 68%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/loss.py (in=7055) (out=2143) (deflated 70%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/inference.py (in=4454) (out=1649) (deflated 63%)
adding: src/qd/mask/modeling/roi_heads/keypoint_head/roi_keypoint_predictors.py (in=1259) (out=526) (deflated 58%)
adding: src/qd/mask/modeling/roi_heads/box_head/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/box_head/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/box_head/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/roi_heads/box_head/loss.py (in=14045) (out=3361) (deflated 76%)
adding: src/qd/mask/modeling/roi_heads/box_head/box_head.py (in=3596) (out=1121) (deflated 69%)
adding: src/qd/mask/modeling/roi_heads/box_head/inference.py (in=20277) (out=4499) (deflated 78%)
adding: src/qd/mask/modeling/roi_heads/box_head/roi_box_predictors.py (in=3860) (out=1031) (deflated 73%)
adding: src/qd/mask/modeling/roi_heads/box_head/roi_box_feature_extractors.py (in=5830) (out=1382) (deflated 76%)
adding: src/qd/mask/modeling/roi_heads/roi_heads.py (in=4280) (out=1069) (deflated 75%)
adding: src/qd/mask/modeling/captioning/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/utils_data.py (in=3109) (out=915) (deflated 71%)
adding: src/qd/mask/modeling/captioning/utils_cbs.py (in=40304) (out=10428) (deflated 74%)
adding: src/qd/mask/modeling/captioning/utils.py (in=1670) (out=540) (deflated 68%)
adding: src/qd/mask/modeling/captioning/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/params.json (in=151) (out=100) (deflated 34%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/eval.py (in=1426) (out=467) (deflated 67%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/cider_scorer.py (in=8234) (out=2585) (deflated 69%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/cider/cider.py (in=1890) (out=822) (deflated 57%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpql9uU7 (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/ptbtokenizer.py (in=3889) (out=1240) (deflated 68%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/stanford-corenlp-3.4.1.jar (in=5921410) (out=5400206) (deflated 9%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpzNW4I2 (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpBF49XX (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpxAmV_C (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/tokenizer/tmpuCp_T0 (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/ciderD_scorer.py (in=8860) (out=2746) (deflated 69%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/__init__.py (in=21) (out=21) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pyciderevalcap/ciderD/ciderD.py (in=1968) (out=868) (deflated 56%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/ (in=0) (out=0) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/__init__.py (in=20) (out=20) (stored 0%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/loadData.py (in=883) (out=377) (deflated 57%)
adding: src/qd/mask/modeling/captioning/cider/pydataformat/jsonify_refs.py (in=1159) (out=480) (deflated 59%)
adding: src/qd/mask/modeling/captioning/cider/cidereval.ipynb (in=3034) (out=896) (deflated 70%)
adding: src/qd/mask/modeling/captioning/cider/license.txt (in=1561) (out=850) (deflated 46%)
adding: src/qd/mask/modeling/captioning/cider/README.md (in=2738) (out=1255) (deflated 54%)
adding: src/qd/mask/modeling/captioning/cider/cidereval.py (in=1356) (out=585) (deflated 57%)
adding: src/qd/mask/modeling/captioning/captioning_e2e.py (in=9777) (out=2724) (deflated 72%)
adding: src/qd/mask/modeling/captioning/utils_caption_evaluate.py (in=14512) (out=4578) (deflated 68%)
adding: src/qd/mask/modeling/captioning/scan_utils.py (in=18086) (out=4385) (deflated 76%)
adding: src/qd/mask/modeling/captioning/utils_solver.py (in=1241) (out=428) (deflated 66%)
adding: src/qd/mask/modeling/captioning/scan.py (in=13928) (out=3532) (deflated 75%)
adding: src/qd/mask/modeling/registry.py (in=476) (out=201) (deflated 58%)
adding: src/qd/db.py (in=23452) (out=5174) (deflated 78%)
adding: src/qd/layers/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/kl_entropy.py (in=1598) (out=608) (deflated 62%)
adding: src/qd/layers/reshape_batch_norm.py (in=2037) (out=452) (deflated 78%)
adding: src/qd/layers/flops_count.py (in=2350) (out=696) (deflated 70%)
adding: src/qd/layers/kl_div_logit_loss.py (in=585) (out=298) (deflated 49%)
adding: src/qd/layers/resnet_vl.py (in=16876) (out=3354) (deflated 80%)
adding: src/qd/layers/shufflenet.py (in=3044) (out=733) (deflated 76%)
adding: src/qd/layers/efficient_det2.py (in=28346) (out=6864) (deflated 76%)
adding: src/qd/layers/mobilenetv3.py (in=9014) (out=2217) (deflated 75%)
adding: src/qd/layers/non_local_net/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/non_local_net/readme.txt (in=46) (out=46) (stored 0%)
adding: src/qd/layers/non_local_net/non_local_gaussian.py (in=4915) (out=1040) (deflated 79%)
adding: src/qd/layers/non_local_net/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/non_local_net/non_local_concatenation.py (in=5512) (out=1154) (deflated 79%)
adding: src/qd/layers/non_local_net/non_local_dot_product.py (in=5087) (out=1020) (deflated 80%)
adding: src/qd/layers/non_local_net/non_local_embedded_gaussian.py (in=4597) (out=910) (deflated 80%)
adding: src/qd/layers/softmaxtree.py (in=682) (out=273) (deflated 60%)
adding: src/qd/layers/image_text_align.py (in=12435) (out=2801) (deflated 77%)
adding: src/qd/layers/forward_pass_time_checker.py (in=2282) (out=745) (deflated 67%)
adding: src/qd/layers/yolov5.py (in=16886) (out=5564) (deflated 67%)
adding: src/qd/layers/boxlist_nms.py (in=6241) (out=1142) (deflated 82%)
adding: src/qd/layers/forward_pass_feature_cache.py (in=3209) (out=943) (deflated 71%)
adding: src/qd/layers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/resnet.py (in=14799) (out=2870) (deflated 81%)
adding: src/qd/layers/mitorch_models/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/efficientnet.py (in=3541) (out=995) (deflated 72%)
adding: src/qd/layers/mitorch_models/shufflenet.py (in=4311) (out=1059) (deflated 75%)
adding: src/qd/layers/mitorch_models/resnext.py (in=3382) (out=919) (deflated 73%)
adding: src/qd/layers/mitorch_models/mobilenetv3.py (in=4249) (out=727) (deflated 83%)
adding: src/qd/layers/mitorch_models/mobilenetv2.py (in=3323) (out=758) (deflated 77%)
adding: src/qd/layers/mitorch_models/vgg.py (in=9394) (out=719) (deflated 92%)
adding: src/qd/layers/mitorch_models/ssd_lite.py (in=1661) (out=548) (deflated 67%)
adding: src/qd/layers/mitorch_models/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/__init__.py (in=923) (out=330) (deflated 64%)
adding: src/qd/layers/mitorch_models/factory.py (in=11197) (out=1199) (deflated 89%)
adding: src/qd/layers/mitorch_models/feature_pyramid_network.py (in=7059) (out=1041) (deflated 85%)
adding: src/qd/layers/mitorch_models/model.py (in=2380) (out=863) (deflated 64%)
adding: src/qd/layers/mitorch_models/modules/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/modules/base.py (in=144) (out=99) (deflated 31%)
adding: src/qd/layers/mitorch_models/modules/convolution.py (in=2309) (out=606) (deflated 74%)
adding: src/qd/layers/mitorch_models/modules/prior_box.py (in=2162) (out=714) (deflated 67%)
adding: src/qd/layers/mitorch_models/modules/mbconv.py (in=1312) (out=454) (deflated 65%)
adding: src/qd/layers/mitorch_models/modules/shuffle.py (in=497) (out=232) (deflated 53%)
adding: src/qd/layers/mitorch_models/modules/se_block.py (in=969) (out=400) (deflated 59%)
adding: src/qd/layers/mitorch_models/modules/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/layers/mitorch_models/modules/retina_prior_box.py (in=913) (out=363) (deflated 60%)
adding: src/qd/layers/mitorch_models/modules/__init__.py (in=497) (out=219) (deflated 56%)
adding: src/qd/layers/mitorch_models/modules/focal_loss.py (in=2816) (out=809) (deflated 71%)
adding: src/qd/layers/mitorch_models/modules/retina_predictor.py (in=460) (out=205) (deflated 55%)
adding: src/qd/layers/mitorch_models/modules/ssd_predictor.py (in=3127) (out=950) (deflated 70%)
adding: src/qd/layers/mitorch_models/modules/activation.py (in=387) (out=150) (deflated 61%)
adding: src/qd/layers/mitorch_models/modules/ssd_loss.py (in=9847) (out=2659) (deflated 73%)
adding: src/qd/layers/mitorch_models/modules/addition.py (in=237) (out=139) (deflated 41%)
adding: src/qd/layers/mitorch_models/modules/linear.py (in=346) (out=174) (deflated 50%)
adding: src/qd/layers/mitorch_models/modules/non_max_suppression.py (in=3545) (out=1048) (deflated 70%)
adding: src/qd/layers/mitorch_models/classifier.py (in=354) (out=173) (deflated 51%)
adding: src/qd/layers/mitorch_models/seresnext.py (in=2054) (out=529) (deflated 74%)
adding: src/qd/layers/mitorch_models/bidirectional_feature_pyramid_network.py (in=5167) (out=1227) (deflated 76%)
adding: src/qd/layers/mitorch_models/ssdlite_extra_layers.py (in=3847) (out=782) (deflated 80%)
adding: src/qd/layers/mitorch_models/squeezenet.py (in=1933) (out=550) (deflated 72%)
adding: src/qd/layers/mitorch_models/shufflenetv2.py (in=4143) (out=986) (deflated 76%)
adding: src/qd/layers/mitorch_models/retinanet.py (in=2252) (out=675) (deflated 70%)
adding: src/qd/layers/__init__.py (in=338) (out=160) (deflated 53%)
adding: src/qd/layers/ssfpn.py (in=5639) (out=1751) (deflated 69%)
adding: src/qd/layers/batch_norm.py (in=8147) (out=2098) (deflated 74%)
adding: src/qd/layers/loss.py (in=22102) (out=4109) (deflated 81%)
adding: src/qd/layers/group_batch_norm.py (in=1550) (out=485) (deflated 69%)
adding: src/qd/layers/adapt_avg_pool2d.py (in=620) (out=291) (deflated 53%)
adding: src/qd/layers/forward_pass_memory_checker.py (in=2338) (out=774) (deflated 67%)
adding: src/qd/layers/forward_image_model.py (in=225) (out=138) (deflated 39%)
adding: src/qd/layers/create_layer.py (in=142) (out=107) (deflated 25%)
adding: src/qd/layers/efficient_det.py (in=93199) (out=18170) (deflated 81%)
adding: src/qd/layers/smooth_l1_loss.py (in=1079) (out=418) (deflated 61%)
adding: src/qd/layers/standarized_conv.py (in=1539) (out=472) (deflated 69%)
adding: src/qd/layers/tensor_queue.py (in=1165) (out=471) (deflated 60%)
adding: src/qd/layers/feature_extract.py (in=2216) (out=641) (deflated 71%)
adding: src/qd/layers/merge_batch_norm.py (in=4277) (out=1302) (deflated 70%)
adding: src/qd/layers/ntxent_loss.py (in=15841) (out=3249) (deflated 79%)
adding: src/qd/layers/precise_bn.py (in=4286) (out=722) (deflated 83%)
adding: src/qd/process_tsv.py (in=315680) (out=60053) (deflated 81%)
adding: src/qd/compile/ (in=0) (out=0) (stored 0%)
adding: src/qd/compile/gcc_ignore.py (in=2131) (out=956) (deflated 55%)
adding: src/qd/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/prep_dataset/ (in=0) (out=0) (stored 0%)
adding: src/qd/prep_dataset/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/prep_dataset/vlp_version.py (in=1791) (out=714) (deflated 60%)
adding: src/qd/prep_dataset/open_image_v5c.py (in=14437) (out=3417) (deflated 76%)
adding: src/qd/prep_dataset/wider_face.py (in=1996) (out=765) (deflated 62%)
adding: src/qd/prep_dataset/prep_coco_2017.py (in=6367) (out=1565) (deflated 75%)
adding: src/qd/prep_dataset/clean_label.py (in=7181) (out=1860) (deflated 74%)
adding: src/qd/prep_dataset/vizwiz.py (in=4752) (out=991) (deflated 79%)
adding: src/qd/prep_dataset/open_image_v6_det.py (in=14745) (out=3485) (deflated 76%)
adding: src/qd/prep_dataset/build_tax_data.py (in=35401) (out=3365) (deflated 90%)
adding: src/qd/gpu_util.py (in=3780) (out=1082) (deflated 71%)
adding: src/qd/pipeline_runner.py (in=7501) (out=1882) (deflated 75%)
adding: src/qd/__init__.py (in=0) (out=0) (stored 0%)
adding: src/qd/tsv_io.py (in=50610) (out=11013) (deflated 78%)
adding: src/qd/data_layer/ (in=0) (out=0) (stored 0%)
adding: src/qd/data_layer/batch_kmeans.py (in=6645) (out=1870) (deflated 72%)
adding: src/qd/data_layer/dataset.py (in=9251) (out=1937) (deflated 79%)
adding: src/qd/data_layer/transform.py (in=75960) (out=14363) (deflated 81%)
adding: src/qd/data_layer/autoaugmentation.py (in=11211) (out=1846) (deflated 84%)
adding: src/qd/data_layer/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/data_layer/builder.py (in=14785) (out=3166) (deflated 79%)
adding: src/qd/data_layer/samplers.py (in=7262) (out=2073) (deflated 71%)
adding: src/qd/data_layer/rand_augmentation.py (in=29487) (out=6586) (deflated 78%)
adding: src/qd/data_layer/loader.py (in=871) (out=285) (deflated 67%)
adding: src/qd/remote_run.py (in=6483) (out=1689) (deflated 74%)
adding: src/qd/deteval.py (in=26011) (out=6577) (deflated 75%)
adding: src/qd/logger.py (in=3511) (out=976) (deflated 72%)
adding: src/qd/pytablemd.py (in=3412) (out=1223) (deflated 64%)
adding: src/qd/examples/ (in=0) (out=0) (stored 0%)
adding: src/qd/examples/efficient_det0.py (in=2143) (out=900) (deflated 58%)
adding: src/qd/opt/ (in=0) (out=0) (stored 0%)
adding: src/qd/opt/checkpoint.py (in=14958) (out=4595) (deflated 69%)
adding: src/qd/opt/sampler.py (in=3528) (out=967) (deflated 73%)
adding: src/qd/opt/sgd.py (in=832) (out=337) (deflated 59%)
adding: src/qd/opt/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/opt/__init__.py (in=71) (out=61) (deflated 14%)
adding: src/qd/opt/ema_optimizer.py (in=2790) (out=882) (deflated 68%)
adding: src/qd/opt/WarmupCosineAnnealingLR.py (in=2418) (out=653) (deflated 73%)
adding: src/qd/opt/trainer.py (in=29302) (out=5036) (deflated 83%)
adding: src/qd/gpucluster/ (in=0) (out=0) (stored 0%)
adding: src/qd/gpucluster/aml_client.py (in=52390) (out=11691) (deflated 78%)
adding: src/qd/gpucluster/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/gpucluster/__init__.py (in=123) (out=74) (deflated 40%)
adding: src/qd/gpucluster/philly_client.py (in=59677) (out=14947) (deflated 75%)
adding: src/qd/gpucluster/aml_server.py (in=10236) (out=3472) (deflated 66%)
adding: src/qd/gpucluster/philly_server.py (in=8651) (out=2962) (deflated 66%)
adding: src/qd/gpucluster/README.md (in=20017) (out=6882) (deflated 66%)
adding: src/qd/gpucluster/aux_data (in=37) (out=37) (stored 0%)
adding: src/qd/qd_caffe.py (in=23661) (out=5459) (deflated 77%)
adding: src/qd/cloud_storage.py (in=25121) (out=5551) (deflated 78%)
adding: src/qd/pipelines/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/test (in=40326) (out=8930) (deflated 78%)
adding: src/qd/pipelines/caption_uni_pipeline.py (in=28442) (out=6659) (deflated 77%)
adding: src/qd/pipelines/multi_scale/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_multi_tower.py (in=27974) (out=6487) (deflated 77%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_two_tower.py (in=27387) (out=6467) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_vlp_uni_pipeline_jf.py (in=23762) (out=4858) (deflated 80%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_token_drop.py (in=26366) (out=6310) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_encdec_nocaps.py (in=38404) (out=8847) (deflated 77%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_recurrent_training.py (in=33081) (out=7831) (deflated 76%)
adding: src/qd/pipelines/multi_scale/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/multi_scale/others/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/multi_scale/others/multi_scale_vqa_uni_pipeline.py (in=24525) (out=6219) (deflated 75%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_mutual_tower.py (in=27100) (out=6449) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_attention_select.py (in=40770) (out=8410) (deflated 79%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_clip.py (in=26088) (out=6190) (deflated 76%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline.py (in=42101) (out=9200) (deflated 78%)
adding: src/qd/pipelines/multi_scale/multi_scale_caption_uni_pipeline_encdec.py (in=43151) (out=9675) (deflated 78%)
adding: src/qd/pipelines/multi_scale/caption_uni_pipeline_bbox.py (in=60683) (out=11922) (deflated 80%)
adding: src/qd/pipelines/ViT_tagger_uni_pipeline_vis.py (in=42590) (out=10051) (deflated 76%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding_bertemb.py (in=47335) (out=11193) (deflated 76%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline.py (in=46538) (out=11217) (deflated 76%)
adding: src/qd/pipelines/ViT_tagger_uni_pipeline.py (in=31661) (out=8025) (deflated 75%)
adding: src/qd/pipelines/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/distillation/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/distillation/logit_distill_caption_uni_pipeline.py (in=31204) (out=7376) (deflated 76%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_proposal.py (in=29656) (out=6867) (deflated 77%)
adding: src/qd/pipelines/distillation/logit_distill_multi_scale_caption_uni_pipeline.py (in=33730) (out=7763) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill.py (in=37235) (out=8160) (deflated 78%)
adding: src/qd/pipelines/distillation/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/distillation/multi_scale_distillation_caption_uni_pipeline_encdec.py (in=28288) (out=6543) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_encoder_decoder.py (in=33806) (out=7639) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_gumbel.py (in=36730) (out=8459) (deflated 77%)
adding: src/qd/pipelines/distillation/vlp_uni_pipeline_distill_with_tags.py (in=69857) (out=13401) (deflated 81%)
adding: src/qd/pipelines/distillation/correct_distill_multi_scale_caption_uni_pipeline.py (in=34062) (out=7956) (deflated 77%)
adding: src/qd/pipelines/distillation/vqa_uni_pipeline_distill.py (in=25027) (out=6281) (deflated 75%)
adding: src/qd/pipelines/distillation/multi_scale_distillation_caption_uni_pipeline.py (in=34318) (out=7997) (deflated 77%)
adding: src/qd/pipelines/ViT_all_token_tagger_uni_pipeline.py (in=27231) (out=7251) (deflated 73%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding_bertemb_vis.py (in=51299) (out=11999) (deflated 77%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding_bertemb_distill.py (in=56629) (out=12861) (deflated 77%)
adding: src/qd/pipelines/others/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/others/sim_clr.py (in=5938) (out=1923) (deflated 68%)
adding: src/qd/pipelines/others/kl_entropy_pipeline.py (in=932) (out=394) (deflated 58%)
adding: src/qd/pipelines/others/tap_uni_pipeline.py (in=4162) (out=1220) (deflated 71%)
adding: src/qd/pipelines/others/faster_rcnn_distill.py (in=8417) (out=2139) (deflated 75%)
adding: src/qd/pipelines/others/ocpretrain.py (in=3590) (out=1315) (deflated 63%)
adding: src/qd/pipelines/others/moco_distill.py (in=32695) (out=7145) (deflated 78%)
adding: src/qd/pipelines/others/checkpoint_zero_pipeline.py (in=25473) (out=6271) (deflated 75%)
adding: src/qd/pipelines/others/simple_vl.py (in=3368) (out=1136) (deflated 66%)
adding: src/qd/pipelines/others/efficient_det_distill.py (in=6117) (out=1576) (deflated 74%)
adding: src/qd/pipelines/others/classification_for_maskrcnn.py (in=2602) (out=972) (deflated 63%)
adding: src/qd/pipelines/others/m4c_tap.py (in=13809) (out=3456) (deflated 75%)
adding: src/qd/pipelines/others/mmask_pretrain.py (in=20094) (out=5207) (deflated 74%)
adding: src/qd/pipelines/others/image_text_retrieval.py (in=32481) (out=7683) (deflated 76%)
adding: src/qd/pipelines/others/auto_param.py (in=38878) (out=7254) (deflated 81%)
adding: src/qd/pipelines/others/multi_scale_vlp_uni_pipeline.py (in=16066) (out=4140) (deflated 74%)
adding: src/qd/pipelines/others/yolo_by_mask.py (in=32567) (out=7973) (deflated 76%)
adding: src/qd/pipelines/others/vqa.py (in=29396) (out=7593) (deflated 74%)
adding: src/qd/pipelines/others/tagger_caption_uni_pipeline_expanding_bertemb_gradient.py (in=45852) (out=10864) (deflated 76%)
adding: src/qd/pipelines/others/yolov5.py (in=73599) (out=22919) (deflated 69%)
adding: src/qd/pipelines/others/soft_balanced.py (in=40025) (out=7721) (deflated 81%)
adding: src/qd/pipelines/others/s4_pipeline.py (in=9581) (out=2696) (deflated 72%)
adding: src/qd/pipelines/others/qd_mmdetection.py (in=10131) (out=2816) (deflated 72%)
adding: src/qd/pipelines/others/late_fusion_caption_uni_pipeline.py (in=6150) (out=1754) (deflated 71%)
adding: src/qd/pipelines/others/cluster_fit.py (in=1531) (out=579) (deflated 62%)
adding: src/qd/pipelines/others/fb_swav.py (in=32879) (out=8783) (deflated 73%)
adding: src/qd/pipelines/others/triplet_contrastive_pipeline.py (in=10795) (out=3472) (deflated 68%)
adding: src/qd/pipelines/others/clip_uni_pipeline.py (in=22206) (out=5015) (deflated 77%)
adding: src/qd/pipelines/others/e2e_caption.py (in=44482) (out=9419) (deflated 79%)
adding: src/qd/pipelines/others/fcos.py (in=22424) (out=5299) (deflated 76%)
adding: src/qd/pipelines/others/mm_detect.py (in=8232) (out=2473) (deflated 70%)
adding: src/qd/pipelines/others/heatmap_score_box.py (in=8226) (out=2295) (deflated 72%)
adding: src/qd/pipelines/others/fb_moco.py (in=52221) (out=10031) (deflated 81%)
adding: src/qd/pipelines/others/classification_by_maskrcnn.py (in=21947) (out=5030) (deflated 77%)
adding: src/qd/pipelines/others/detectron2.py (in=17876) (out=5154) (deflated 71%)
adding: src/qd/pipelines/others/pipeline_base.py (in=167) (out=106) (deflated 37%)
adding: src/qd/pipelines/others/slow_contrast.py (in=19457) (out=4703) (deflated 76%)
adding: src/qd/pipelines/others/fast_human_det.py (in=10675) (out=2903) (deflated 73%)
adding: src/qd/pipelines/others/usl_cmc.py (in=14134) (out=3350) (deflated 76%)
adding: src/qd/pipelines/others/clip.py (in=32519) (out=7938) (deflated 76%)
adding: src/qd/pipelines/others/cls_feature_extract_uni_pipeline.py (in=3016) (out=1032) (deflated 66%)
adding: src/qd/pipelines/others/vqa_uni_pipeline.py (in=19771) (out=5266) (deflated 73%)
adding: src/qd/pipelines/others/knn_classifier.py (in=3450) (out=1109) (deflated 68%)
adding: src/qd/pipelines/others/extract_spatial_before_avgpool.py (in=774) (out=332) (deflated 57%)
adding: src/qd/pipelines/others/mmask.py (in=13738) (out=3557) (deflated 74%)
adding: src/qd/pipelines/others/efficient_det_pipeline.py (in=22394) (out=5094) (deflated 77%)
adding: src/qd/pipelines/others/reppoint.py (in=7625) (out=2334) (deflated 69%)
adding: src/qd/pipelines/others/caption_uni_pipeline_distill.py (in=30947) (out=7392) (deflated 76%)
adding: src/qd/pipelines/others/det_clip_uni_pipeline.py (in=530) (out=231) (deflated 56%)
adding: src/qd/pipelines/others/cls_uni_pipeline.py (in=3408) (out=1086) (deflated 68%)
adding: src/qd/pipelines/others/mmask_caption.py (in=45160) (out=9586) (deflated 79%)
adding: src/qd/pipelines/others/distill_caption_uni_pipeline.py (in=18506) (out=4059) (deflated 78%)
adding: src/qd/pipelines/others/yolov2_pt.py (in=6564) (out=1805) (deflated 73%)
adding: src/qd/pipelines/others/contrastive_vlp_uni_pipeline.py (in=10281) (out=2663) (deflated 74%)
adding: src/qd/pipelines/others/vlp_uni_pipeline.py (in=9724) (out=2366) (deflated 76%)
adding: src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py (in=53559) (out=12387) (deflated 77%)
adding: src/qd/pipelines/Kim/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/Kim/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/pipelines/Kim/kim_vilt_caption_uni_pipeline.py (in=29311) (out=7167) (deflated 76%)
adding: src/qd/pipelines/Kim/kim_vqa_uni_pipeline.py (in=24356) (out=6448) (deflated 74%)
adding: src/qd/pipelines/Kim/kim_vqa_logit_distill_uni_pipeline.py (in=28242) (out=7251) (deflated 74%)
adding: src/qd/pipelines/vqa_uni_pipeline.py (in=19811) (out=5272) (deflated 73%)
adding: src/qd/pipelines/uni_pipeline.py (in=81347) (out=16646) (deflated 80%)
adding: src/qd/pipelines/vlp_uni_pipeline.py (in=12680) (out=3056) (deflated 76%)
adding: src/qd/latex_writer.py (in=7828) (out=1846) (deflated 76%)
adding: src/qd/hnms.py (in=18222) (out=3055) (deflated 83%)
adding: src/qd/process_dataset.py (in=11742) (out=2804) (deflated 76%)
adding: src/qd/project/ (in=0) (out=0) (stored 0%)
adding: src/qd/project/text_aware_pre_training.py (in=66226) (out=11617) (deflated 82%)
adding: src/qd/project/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/qd/project/semi_weak_pretrain.py (in=97677) (out=16466) (deflated 83%)
adding: src/qd/project/general_vision_language.py (in=25055) (out=6017) (deflated 76%)
adding: src/qd/garbage_collector.py (in=4639) (out=1269) (deflated 73%)
adding: src/qd/demo_detection.py (in=14115) (out=3641) (deflated 74%)
adding: src/qd/qd_common.py (in=123355) (out=30491) (deflated 75%)
adding: src/pytorch_image_models/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/optim/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/optim/radam.py (in=5924) (out=1129) (deflated 81%)
adding: src/pytorch_image_models/timm/optim/optim_factory.py (in=4764) (out=1221) (deflated 74%)
adding: src/pytorch_image_models/timm/optim/adahessian.py (in=6535) (out=2197) (deflated 66%)
adding: src/pytorch_image_models/timm/optim/nvnovograd.py (in=4795) (out=1605) (deflated 67%)
adding: src/pytorch_image_models/timm/optim/adamp.py (in=3689) (out=1261) (deflated 66%)
adding: src/pytorch_image_models/timm/optim/lookahead.py (in=3815) (out=1185) (deflated 69%)
adding: src/pytorch_image_models/timm/optim/__init__.py (in=368) (out=158) (deflated 57%)
adding: src/pytorch_image_models/timm/optim/sgdp.py (in=3231) (out=1160) (deflated 64%)
adding: src/pytorch_image_models/timm/optim/novograd.py (in=2925) (out=943) (deflated 68%)
adding: src/pytorch_image_models/timm/optim/adafactor.py (in=8126) (out=2354) (deflated 71%)
adding: src/pytorch_image_models/timm/optim/rmsprop_tf.py (in=6127) (out=2017) (deflated 67%)
adding: src/pytorch_image_models/timm/optim/adamw.py (in=4965) (out=1603) (deflated 68%)
adding: src/pytorch_image_models/timm/optim/nadam.py (in=3758) (out=1304) (deflated 65%)
adding: src/pytorch_image_models/timm/scheduler/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/scheduler/cosine_lr.py (in=3977) (out=1196) (deflated 70%)
adding: src/pytorch_image_models/timm/scheduler/scheduler.py (in=4750) (out=1467) (deflated 69%)
adding: src/pytorch_image_models/timm/scheduler/plateau_lr.py (in=4140) (out=1274) (deflated 69%)
adding: src/pytorch_image_models/timm/scheduler/tanh_lr.py (in=4045) (out=1157) (deflated 71%)
adding: src/pytorch_image_models/timm/scheduler/__init__.py (in=206) (out=100) (deflated 51%)
adding: src/pytorch_image_models/timm/scheduler/step_lr.py (in=1902) (out=589) (deflated 69%)
adding: src/pytorch_image_models/timm/scheduler/scheduler_factory.py (in=3268) (out=654) (deflated 80%)
adding: src/pytorch_image_models/timm/loss/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/loss/jsd.py (in=1595) (out=747) (deflated 53%)
adding: src/pytorch_image_models/timm/loss/cross_entropy.py (in=1082) (out=393) (deflated 64%)
adding: src/pytorch_image_models/timm/loss/__init__.py (in=191) (out=116) (deflated 39%)
adding: src/pytorch_image_models/timm/loss/asymmetric_loss.py (in=3322) (out=1001) (deflated 70%)
adding: src/pytorch_image_models/timm/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/__init__.py (in=189) (out=108) (deflated 43%)
adding: src/pytorch_image_models/timm/data/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/mixup.py (in=14711) (out=3297) (deflated 78%)
adding: src/pytorch_image_models/timm/data/config.py (in=2756) (out=782) (deflated 72%)
adding: src/pytorch_image_models/timm/data/parsers/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/parsers/parser_tfds.py (in=10443) (out=3703) (deflated 65%)
adding: src/pytorch_image_models/timm/data/parsers/constants.py (in=43) (out=38) (deflated 12%)
adding: src/pytorch_image_models/timm/data/parsers/parser_factory.py (in=1116) (out=480) (deflated 57%)
adding: src/pytorch_image_models/timm/data/parsers/parser_image_tar.py (in=2589) (out=1005) (deflated 61%)
adding: src/pytorch_image_models/timm/data/parsers/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/parsers/__init__.py (in=42) (out=40) (deflated 5%)
adding: src/pytorch_image_models/timm/data/parsers/class_map.py (in=571) (out=269) (deflated 53%)
adding: src/pytorch_image_models/timm/data/parsers/parser_image_folder.py (in=2508) (out=983) (deflated 61%)
adding: src/pytorch_image_models/timm/data/parsers/parser_image_in_tar.py (in=8987) (out=2855) (deflated 68%)
adding: src/pytorch_image_models/timm/data/parsers/parser.py (in=487) (out=174) (deflated 64%)
adding: src/pytorch_image_models/timm/data/dataset.py (in=4506) (out=1302) (deflated 71%)
adding: src/pytorch_image_models/timm/data/dataset_factory.py (in=1057) (out=433) (deflated 59%)
adding: src/pytorch_image_models/timm/data/constants.py (in=303) (out=153) (deflated 50%)
adding: src/pytorch_image_models/timm/data/tf_preprocessing.py (in=9120) (out=2622) (deflated 71%)
adding: src/pytorch_image_models/timm/data/distributed_sampler.py (in=1955) (out=720) (deflated 63%)
adding: src/pytorch_image_models/timm/data/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/data/__init__.py (in=553) (out=226) (deflated 59%)
adding: src/pytorch_image_models/timm/data/transforms_factory.py (in=8262) (out=2049) (deflated 75%)
adding: src/pytorch_image_models/timm/data/loader.py (in=8732) (out=2441) (deflated 72%)
adding: src/pytorch_image_models/timm/data/real_labels.py (in=1590) (out=677) (deflated 57%)
adding: src/pytorch_image_models/timm/data/random_erasing.py (in=4512) (out=1620) (deflated 64%)
adding: src/pytorch_image_models/timm/data/transforms.py (in=5328) (out=1626) (deflated 69%)
adding: src/pytorch_image_models/timm/data/auto_augment.py (in=29504) (out=6595) (deflated 78%)
adding: src/pytorch_image_models/timm/models/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/models/selecsls.py (in=13100) (out=3099) (deflated 76%)
adding: src/pytorch_image_models/timm/models/resnetv2.py (in=23800) (out=5365) (deflated 77%)
adding: src/pytorch_image_models/timm/models/efficientnet.py (in=71184) (out=8131) (deflated 89%)
adding: src/pytorch_image_models/timm/models/inception_resnet_v2.py (in=12318) (out=2211) (deflated 82%)
adding: src/pytorch_image_models/timm/models/rexnet.py (in=9972) (out=2841) (deflated 72%)
adding: src/pytorch_image_models/timm/models/hrnet.py (in=29301) (out=5030) (deflated 83%)
adding: src/pytorch_image_models/timm/models/cspnet.py (in=17904) (out=4226) (deflated 76%)
adding: src/pytorch_image_models/timm/models/t2t_vit/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/models/t2t_vit/transformer_block.py (in=3286) (out=1144) (deflated 65%)
adding: src/pytorch_image_models/timm/models/t2t_vit/token_performer.py (in=1129) (out=527) (deflated 53%)
adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit_dense.py (in=6736) (out=2132) (deflated 68%)
adding: src/pytorch_image_models/timm/models/t2t_vit/__pycache__/ (in=0) (out=0) (stored 0%)
adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit.py (in=12471) (out=2496) (deflated 80%) adding: src/pytorch_image_models/timm/models/t2t_vit/__init__.py (in=310) (out=195) (deflated 37%) adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit_ghost.py (in=7737) (out=2148) (deflated 72%) adding: src/pytorch_image_models/timm/models/t2t_vit/t2t_vit_se.py (in=6305) (out=2029) (deflated 68%) adding: src/pytorch_image_models/timm/models/t2t_vit/token_transformer.py (in=2326) (out=917) (deflated 61%) adding: src/pytorch_image_models/timm/models/mobilenetv3.py (in=17607) (out=3544) (deflated 80%) adding: src/pytorch_image_models/timm/models/resnest.py (in=10194) (out=2170) (deflated 79%) adding: src/pytorch_image_models/timm/models/gluon_xception.py (in=9530) (out=2523) (deflated 74%) adding: src/pytorch_image_models/timm/models/vision_transformer.py (in=67293) (out=11252) (deflated 83%) adding: src/pytorch_image_models/timm/models/efficientnet_builder.py (in=17482) (out=4957) (deflated 72%) adding: src/pytorch_image_models/timm/models/nfnet.py (in=20612) (out=5303) (deflated 74%) adding: src/pytorch_image_models/timm/models/layers/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/layers/config.py (in=3069) (out=801) (deflated 74%) adding: src/pytorch_image_models/timm/models/layers/padding.py (in=2167) (out=864) (deflated 60%) adding: src/pytorch_image_models/timm/models/layers/norm_act.py (in=3542) (out=1147) (deflated 68%) adding: src/pytorch_image_models/timm/models/layers/inplace_abn.py (in=3353) (out=1132) (deflated 66%) adding: src/pytorch_image_models/timm/models/layers/pool2d_same.py (in=2969) (out=801) (deflated 73%) adding: src/pytorch_image_models/timm/models/layers/median_pool.py (in=1737) (out=662) (deflated 62%) adding: src/pytorch_image_models/timm/models/layers/__pycache__/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/layers/mixed_conv2d.py (in=1844) (out=786) (deflated 57%) adding: src/pytorch_image_models/timm/models/layers/cbam.py (in=3337) (out=944) (deflated 72%) adding: src/pytorch_image_models/timm/models/layers/activations_me.py (in=5886) (out=1439) (deflated 76%) adding: src/pytorch_image_models/timm/models/layers/__init__.py (in=1767) (out=648) (deflated 63%) adding: src/pytorch_image_models/timm/models/layers/blur_pool.py (in=2180) (out=966) (deflated 56%) adding: src/pytorch_image_models/timm/models/layers/std_conv.py (in=3920) (out=971) (deflated 75%) adding: src/pytorch_image_models/timm/models/layers/split_batchnorm.py (in=3441) (out=1216) (deflated 65%) adding: src/pytorch_image_models/timm/models/layers/test_time_pool.py (in=1851) (out=708) (deflated 62%) adding: src/pytorch_image_models/timm/models/layers/activations.py (in=4040) (out=1104) (deflated 73%) adding: src/pytorch_image_models/timm/models/layers/selective_kernel.py (in=5282) (out=1716) (deflated 68%) adding: src/pytorch_image_models/timm/models/layers/create_conv2d.py (in=1399) (out=591) (deflated 58%) adding: src/pytorch_image_models/timm/models/layers/split_attn.py (in=3013) (out=1008) (deflated 67%) adding: src/pytorch_image_models/timm/models/layers/create_attn.py (in=1418) (out=461) (deflated 67%) adding: src/pytorch_image_models/timm/models/layers/weight_init.py (in=2359) (out=1003) (deflated 57%) adding: src/pytorch_image_models/timm/models/layers/evo_norm.py (in=3328) (out=981) (deflated 71%) adding: src/pytorch_image_models/timm/models/layers/adaptive_avgmax_pool.py (in=3903) (out=1008) (deflated 74%) adding: 
src/pytorch_image_models/timm/models/layers/eca.py (in=4701) (out=1848) (deflated 61%) adding: src/pytorch_image_models/timm/models/layers/create_act.py (in=3904) (out=1084) (deflated 72%) adding: src/pytorch_image_models/timm/models/layers/helpers.py (in=738) (out=382) (deflated 48%) adding: src/pytorch_image_models/timm/models/layers/classifier.py (in=2300) (out=784) (deflated 66%) adding: src/pytorch_image_models/timm/models/layers/create_norm_act.py (in=3327) (out=1199) (deflated 64%) adding: src/pytorch_image_models/timm/models/layers/space_to_depth.py (in=1750) (out=490) (deflated 72%) adding: src/pytorch_image_models/timm/models/layers/drop.py (in=6938) (out=2062) (deflated 70%) adding: src/pytorch_image_models/timm/models/layers/activations_jit.py (in=2529) (out=877) (deflated 65%) adding: src/pytorch_image_models/timm/models/layers/conv_bn_act.py (in=1466) (out=585) (deflated 60%) adding: src/pytorch_image_models/timm/models/layers/linear.py (in=743) (out=380) (deflated 49%) adding: src/pytorch_image_models/timm/models/layers/conv2d_same.py (in=1490) (out=596) (deflated 60%) adding: src/pytorch_image_models/timm/models/layers/separable_conv.py (in=2641) (out=717) (deflated 73%) adding: src/pytorch_image_models/timm/models/layers/anti_aliasing.py (in=2293) (out=708) (deflated 69%) adding: src/pytorch_image_models/timm/models/layers/cond_conv2d.py (in=5129) (out=1597) (deflated 69%) adding: src/pytorch_image_models/timm/models/layers/se.py (in=2294) (out=761) (deflated 67%) adding: src/pytorch_image_models/timm/models/senet.py (in=17637) (out=3473) (deflated 80%) adding: src/pytorch_image_models/timm/models/__pycache__/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/resnet.py (in=58768) (out=8319) (deflated 86%) adding: src/pytorch_image_models/timm/models/dpn.py (in=12328) (out=2859) (deflated 77%) adding: src/pytorch_image_models/timm/models/__init__.py (in=1064) (out=310) (deflated 71%) adding: src/pytorch_image_models/timm/models/dla.py (in=17155) (out=3586) (deflated 79%) adding: src/pytorch_image_models/timm/models/factory.py (in=2768) (out=1100) (deflated 60%) adding: src/pytorch_image_models/timm/models/hub.py (in=3409) (out=1326) (deflated 61%) adding: src/pytorch_image_models/timm/models/densenet.py (in=15595) (out=3709) (deflated 76%) adding: src/pytorch_image_models/timm/models/efficientnet_blocks.py (in=14680) (out=3084) (deflated 79%) adding: src/pytorch_image_models/timm/models/res2net.py (in=7849) (out=1919) (deflated 76%) adding: src/pytorch_image_models/timm/models/regnet.py (in=20529) (out=4873) (deflated 76%) adding: src/pytorch_image_models/timm/models/gluon_resnet.py (in=11348) (out=1489) (deflated 87%) adding: src/pytorch_image_models/timm/models/helpers.py (in=20989) (out=5413) (deflated 74%) adding: src/pytorch_image_models/timm/models/pnasnet.py (in=14839) (out=2860) (deflated 81%) adding: src/pytorch_image_models/timm/models/nasnet.py (in=25683) (out=3028) (deflated 88%) adding: src/pytorch_image_models/timm/models/vovnet.py (in=13821) (out=3150) (deflated 77%) adding: src/pytorch_image_models/timm/models/xception_aligned.py (in=9266) (out=2287) (deflated 75%) adding: src/pytorch_image_models/timm/models/pruned/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/models/pruned/ecaresnet101d_pruned.txt (in=8734) (out=1311) (deflated 85%) adding: src/pytorch_image_models/timm/models/pruned/efficientnet_b2_pruned.txt (in=18676) (out=2208) (deflated 88%) adding: 
src/pytorch_image_models/timm/models/pruned/efficientnet_b3_pruned.txt (in=21133) (out=2476) (deflated 88%) adding: src/pytorch_image_models/timm/models/pruned/ecaresnet50d_pruned.txt (in=4520) (out=756) (deflated 83%) adding: src/pytorch_image_models/timm/models/pruned/efficientnet_b1_pruned.txt (in=18596) (out=2208) (deflated 88%) adding: src/pytorch_image_models/timm/models/sknet.py (in=8709) (out=1966) (deflated 77%) adding: src/pytorch_image_models/timm/models/features.py (in=12155) (out=3574) (deflated 71%) adding: src/pytorch_image_models/timm/models/registry.py (in=3970) (out=1351) (deflated 66%) adding: src/pytorch_image_models/timm/models/xception.py (in=7372) (out=2142) (deflated 71%) adding: src/pytorch_image_models/timm/models/inception_v3.py (in=17431) (out=3180) (deflated 82%) adding: src/pytorch_image_models/timm/models/tresnet.py (in=11433) (out=2709) (deflated 76%) adding: src/pytorch_image_models/timm/models/inception_v4.py (in=10723) (out=1913) (deflated 82%) adding: src/pytorch_image_models/timm/version.py (in=22) (out=22) (stored 0%) adding: src/pytorch_image_models/timm/utils/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/utils/summary.py (in=1074) (out=478) (deflated 55%) adding: src/pytorch_image_models/timm/utils/__pycache__/ (in=0) (out=0) (stored 0%) adding: src/pytorch_image_models/timm/utils/__init__.py (in=459) (out=242) (deflated 47%) adding: src/pytorch_image_models/timm/utils/misc.py (in=644) (out=375) (deflated 42%) adding: src/pytorch_image_models/timm/utils/jit.py (in=648) (out=361) (deflated 44%) adding: src/pytorch_image_models/timm/utils/model_ema.py (in=5670) (out=1712) (deflated 70%) adding: src/pytorch_image_models/timm/utils/distributed.py (in=896) (out=416) (deflated 54%) adding: src/pytorch_image_models/timm/utils/cuda.py (in=1616) (out=529) (deflated 67%) adding: src/pytorch_image_models/timm/utils/model.py (in=389) (out=212) (deflated 46%) adding: src/pytorch_image_models/timm/utils/metrics.py (in=867) (out=426) (deflated 51%) adding: src/pytorch_image_models/timm/utils/checkpoint_saver.py (in=6133) (out=1649) (deflated 73%) adding: src/pytorch_image_models/timm/utils/log.py (in=1015) (out=429) (deflated 58%) adding: stats.pdf (in=31668) (out=26372) (deflated 17%) adding: tools/ (in=0) (out=0) (stored 0%) adding: tools/azureml/ (in=0) (out=0) (stored 0%) adding: tools/azureml/workspace_utils.py (in=4563) (out=1197) (deflated 74%) adding: tools/azureml/aml_main.py (in=7466) (out=2398) (deflated 68%) adding: tools/azureml/aml_job.py (in=6141) (out=1984) (deflated 68%) adding: tools/azureml/__pycache__/ (in=0) (out=0) (stored 0%) adding: tools/azureml/misc.py (in=404) (out=232) (deflated 43%) adding: tools/azureml/README.md (in=4733) (out=1947) (deflated 59%) adding: tools/azureml/aml_submit.py (in=7630) (out=2642) (deflated 65%) adding: tools/azureml/aml_job_config.json (in=4437) (out=1844) (deflated 58%) adding: tools/common_utils/ (in=0) (out=0) (stored 0%) adding: tools/common_utils/RandAugment.py (in=13823) (out=3960) (deflated 71%) adding: tools/common_utils/azure_storage_io.py (in=6058) (out=1427) (deflated 76%) adding: tools/common_utils/utils.py (in=12776) (out=2937) (deflated 77%) adding: tools/common_utils/__pycache__/ (in=0) (out=0) (stored 0%) adding: tools/common_utils/misc.py (in=6657) (out=1844) (deflated 72%) adding: tools/common_utils/ignore_file.py (in=2631) (out=745) (deflated 72%) adding: tools/common_utils/RandomCrop.py (in=38062) (out=4655) (deflated 88%) adding: vinvl_label.json (in=70538) (out=24658) 
(deflated 65%) adding: visualize.py (in=3403) (out=1336) (deflated 61%)
total bytes=55353896, compressed=15757117 -> 72% savings
2022-03-17 13:32:23,076.076 2829:qd_pytorch.py:1420 load_latest_parameters(): using output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/parameters_2022_03_16_04_43_35.yaml
2022-03-17 13:32:23,368.368 2829:uni_pipeline.py:841 _ensure_initialized(): initialized
2022-03-17 13:32:23,720.720 2829:modeling_utils.py:187 from_pretrained(): loading configuration file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/config.json
2022-03-17 13:32:23,720.720 2829:modeling_utils.py:211 from_pretrained(): Model config {
  "attention_probs_dropout_prob": 0.1,
  "finetuning_task": "image_captioning",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "TIMM_vit",
  "net": "vit_base_patch16_384",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "num_labels": 2,
  "output_attentions": false,
  "output_hidden_states": false,
  "pretrained": true,
  "torchscript": false,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
2022-03-17 13:32:23,722.722 2829:tokenization_utils.py:170 _from_pretrained(): Model name './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc). Assuming './aux_data/untrained_config/VILT-L12-H784-uncased_16_384' is a path or url to a directory containing tokenizer files.
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/added_tokens.json. We won't load it.
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:180 _from_pretrained(): Didn't find file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/special_tokens_map.json. We won't load it.
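[Editor's note] The tokenizer messages above show the usual transformers-style resolution order: the name is first checked against the built-in shortcut list, and anything else is treated as a local directory whose optional files may simply be absent. A minimal sketch of that fallback, assuming standard BERT tokenizer file names; resolve_tokenizer_files is a hypothetical helper, not part of this codebase:

import os

OPTIONAL_FILES = ("added_tokens.json", "special_tokens_map.json")

def resolve_tokenizer_files(name_or_path, shortcut_names=()):
    """Mimic the fallback the log describes: unknown names are local dirs."""
    if name_or_path in shortcut_names:
        # A shortcut name (e.g. "bert-base-uncased") would map to hosted URLs.
        return {"vocab.txt": name_or_path}
    files = {"vocab.txt": os.path.join(name_or_path, "vocab.txt")}  # required
    for fname in OPTIONAL_FILES:
        path = os.path.join(name_or_path, fname)
        # Missing optional files are reported and skipped ("We won't load it."),
        # which is why the loader logs "loading file None" for them below.
        files[fname] = path if os.path.exists(path) else None
    return files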
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:214 _from_pretrained(): loading file None
2022-03-17 13:32:23,723.723 2829:tokenization_utils.py:214 _from_pretrained(): loading file ./aux_data/untrained_config/VILT-L12-H784-uncased_16_384/vocab.txt
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 628, in pipeline_train_eval_multi
    pip.ensure_predict()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
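[Editor's note] The TypeError above is mechanical rather than fatal: the config object stores a BertTokenizer instance among its attributes, so the json.dumps() call inside to_json_string() raises as soon as logging.info(config) triggers __repr__; only the log message is lost, and model construction continues. A minimal sketch of a serialization-safe variant, keeping the method names from the traceback; the default=str fallback is an assumed fix, not the repository's code:

import json

class SafeConfigReprMixin:
    """Config-like object whose repr never raises on non-JSON attributes."""

    def to_dict(self):
        return dict(self.__dict__)

    def to_json_string(self):
        # default=str renders any non-JSON value (such as a BertTokenizer
        # instance stored on the config) as its string form instead of
        # raising "Object of type BertTokenizer is not JSON serializable".
        return json.dumps(self.to_dict(), indent=2, sort_keys=True,
                          default=str) + "\n"

    def __repr__(self):
        return self.to_json_string()

The same error is logged a second time below, consistent with the image encoder being constructed twice (two load_pretrained calls follow).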
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 628, in pipeline_train_eval_multi
    pip.ensure_predict()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
2022-03-17 13:32:24,439.439 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:32:25,514.514 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:32:27,035.035 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:32:27,691.691 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:32:27,840.840 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:32:28,631.631 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight!
Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:32:28,632.632 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 2022-03-17 13:32:29,736.736 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:32:30,268.268 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0066415.pt 2022-03-17 13:32:38,771.771 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,772.772 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,773.773 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from 
bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,774.774 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,775.775 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:32:38,776.776 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 
13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:32:38,777.777 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from 
bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,778.778 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,779.779 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,780.780 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,781.781 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from 
bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:32:38,782.782 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:32:38,783.783 2829:checkpoint.py:99 
align_and_update_state_dicts(): [per-parameter dump condensed; timestamps run 13:32:38,783 through 13:32:38,794, all from 2829:checkpoint.py:99]
  module.bert.encoder.blocks.{3-9}.* loaded from bert.encoder.blocks.{3-9}.*: attn.proj.weight (768, 768)/bias (768,), attn.qkv.weight (2304, 768)/bias (2304,), mlp.fc1.weight (3072, 768)/bias (3072,), mlp.fc2.weight (768, 3072)/bias (768,), norm1 and norm2 weight/bias (768,)
  module.bert.encoder.tag_blocks.{0-3}.* loaded from bert.encoder.tag_blocks.{0-3}.*: same six sub-modules and shapes as the blocks above
  module.bert.extra_embeddings.* loaded from bert.extra_embeddings.*: LayerNorm weight/bias (768,), position_embeddings.weight (512, 768), token_type_embeddings.weight (2, 768), word_embeddings.weight (30522, 768)
  module.bert.pooler.dense.* loaded from bert.pooler.dense.*: weight (768, 768), bias (768,)
  module.bert.tag_logit.predictions.* loaded from bert.tag_logit.predictions.*: bias (30522,), decoder.weight (30522, 768), transform.LayerNorm weight/bias (768,), transform.dense.weight (768, 768)/bias (768,)
  module.cls.predictions.* loaded from cls.predictions.*: bias (30522,), decoder.weight (30522, 768), transform.LayerNorm weight/bias (768,), transform.dense.weight (768, 768)/bias (768,)
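Every entry in the dump above follows one pattern: a target key carrying the `module.` wrapper that (Distributed)DataParallel adds is matched to the checkpoint key without it, with shapes compared along the way. Below is a minimal sketch of that matching step; `align_state_dict` is a hypothetical stand-in for the repo's `align_and_update_state_dicts`, which additionally tallies the matched/loaded counts reported in the lines that follow.

```python
import logging
from collections import OrderedDict

def align_state_dict(model_state, ckpt_state):
    """Match each target parameter to a checkpoint tensor, tolerating the
    'module.' prefix added by (Distributed)DataParallel.  Hypothetical
    sketch; only shape-compatible matches are kept, as in the log."""
    aligned = OrderedDict()
    for tgt_key, tgt_tensor in model_state.items():
        # try the exact name first, then the name with the wrapper stripped
        src_key = tgt_key
        if src_key not in ckpt_state and tgt_key.startswith('module.'):
            src_key = tgt_key[len('module.'):]
        if src_key in ckpt_state and ckpt_state[src_key].shape == tgt_tensor.shape:
            aligned[tgt_key] = ckpt_state[src_key]
            logging.info('%s loaded from %s of shape %s',
                         tgt_key, src_key, tuple(ckpt_state[src_key].shape))
    return aligned
```

Loading the aligned dict with `model.load_state_dict(aligned, strict=False)` leaves any unmatched parameters untouched, which is what the `load_model_state_ignore_mismatch()` lines below report; here only `module.cls.predictions.decoder.weight` is left uninitialized.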
2022-03-17 13:32:38,794.794 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:32:38,794.794 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:32:38,797.797 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:32:38,959.959 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:32:39,268.268 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:32:39,294.294 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:32:39,294.294 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0066415.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:32:39,367.367 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error --- [header and traceback of this first dump were lost in log capture; the identical full dump follows below. Its call stack survives:]
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
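The `DatasetPlusTransform` dump from `predict_iter()` above spells out the test-time preprocessing. As a sketch, the image half corresponds to the torchvision pipeline below; the repo-specific `LoadLabel` and `TransCaptionTensorizer` steps are omitted, and `InterpolationMode.BICUBIC` is the newer spelling of the `PIL.Image.BICUBIC` shown in the log.

```python
from torchvision import transforms

# Test-time image preprocessing as logged by predict_iter(): with
# backend = cv the dataset yields HWC uint8 arrays, hence ToPILImage first.
image_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(384, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),                      # HWC uint8 -> CHW float in [0, 1]
    transforms.Normalize(mean=[0.5, 0.5, 0.5],  # then shift/scale to [-1, 1]
                         std=[0.5, 0.5, 0.5]),
])
```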
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  [identical to the call stack of the first dump above, from pipeline.py line 1368 down to]
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
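Both logging-error dumps stem from the same call (the constructor that runs `logging.info(config)` is evidently executed twice): the config's `__repr__` serializes `self.to_dict()` with `json.dumps`, the dict holds a live `BertTokenizer`, and the default JSON encoder rejects it. The error is cosmetic, in that only the config printout is lost, and the run continues. A minimal reproduction, plus one possible workaround; the `default=repr` fallback is an assumption, not the repo's fix.

```python
import json

class FakeTokenizer:  # stand-in for the BertTokenizer stored in the config
    pass

config_dict = {"hidden_size": 768, "tokenizer": FakeTokenizer()}

try:
    # what to_json_string() does, per the traceback above
    json.dumps(config_dict, indent=2, sort_keys=True)
except TypeError as err:
    print(err)  # Object of type FakeTokenizer is not JSON serializable

# Workaround sketch: fall back to repr() for anything json cannot encode,
# so __repr__ (and hence logging.info(config)) never raises.
print(json.dumps(config_dict, indent=2, sort_keys=True, default=repr))
```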
2022-03-17 13:35:15,791.791 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:35:16,854.854 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth)
2022-03-17 13:35:18,398.398 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth)
2022-03-17 13:35:19,047.047 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:35:19,197.197 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:35:19,989.989 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:35:19,989.989 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:35:21,118.118 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth)
2022-03-17 13:35:21,665.665 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0005000.pt
2022-03-17 13:35:23,674 through 13:35:23,688 2829:checkpoint.py:99 align_and_update_state_dicts(): [per-parameter dump condensed, as above]
  image_encoder.module.* loaded from image_encoder.module.* (names already identical): cls_token (1, 1, 768), pos_embed (1, 577, 768), patch_embed.proj.weight (768, 3, 16, 16)/bias (768,), head.weight (1000, 768)/bias (1000,)
  module.bert.caption_pooler.dense.* loaded from bert.caption_pooler.dense.*: weight (768, 768), bias (768,)
  module.bert.decoder.layer.{0-3}.* loaded from bert.decoder.layer.{0-3}.*: attention.self.{query,key,value}.weight (768, 768)/bias (768,), attention.output.dense.weight (768, 768)/bias (768,), attention.output.LayerNorm weight/bias (768,), intermediate.dense.weight (3072, 768)/bias (3072,), output.dense.weight (768, 3072)/bias (768,), output.LayerNorm weight/bias (768,)
  module.bert.embeddings.* loaded from bert.embeddings.*: LayerNorm weight/bias (768,), position_embeddings.weight (512, 768), token_type_embeddings.weight (2, 768), word_embeddings.weight (30522, 768)
  module.bert.encoder.blocks.{0-6, 10, 11}.* loaded from bert.encoder.blocks.*: same six sub-modules and shapes as in the first checkpoint load
2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,)
2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768)
2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded
from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,688.688 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,689.689 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:35:23,690.690 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:35:23,691.691 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:35:23,692.692 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:35:23,693.693 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:35:23,694.694 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:35:23,695.695 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:35:23,696.696 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:35:23,696.696 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:35:23,699.699 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:35:23,857.857 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:35:24,162.162 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:35:24,188.188 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:35:24,188.188 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0005000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:35:24,262.262 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
2022-03-17 13:37:44,284.284 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:37:45,355.355 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:37:46,911.911 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:37:47,559.559 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:37:47,707.707 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:37:48,499.499 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:37:48,499.499 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:37:49,635.635 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:37:50,179.179 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0010000.pt 2022-03-17 13:37:58,151.151 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:37:58,151.151 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,152.152 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,153.153 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,153.153 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,154.154 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,155.155 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:37:58,156.156 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:37:58,157.157 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:37:58,158.158 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,159.159 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,160.160 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,161.161 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,162.162 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,163.163 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,164.164 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:37:58,165.165 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:37:58,166.166 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,167.167 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,168.168 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,169.169 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,170.170 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:37:58,171.171 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:37:58,172.172 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:37:58,173.173 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:37:58,173.173 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:37:58,177.177 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:37:58,340.340 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:37:58,669.669 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:37:58,695.695 2829:uni_pipeline.py:509 get_data_loader(): sampler = 
2022-03-17 13:37:58,696.696 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0010000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:37:58,768.768 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:40:08,351.351 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:40:09,415.415 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:40:10,941.941 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:40:11,597.597 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:40:11,745.745 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:40:12,537.537 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:40:12,537.537 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:40:13,674.674 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:40:14,218.218 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0015000.pt
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:40:22,393.393 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,394.394 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,395.395 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,396.396 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of 
shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:40:22,397.397 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,398.398 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:40:22,399.399 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768)
[... one checkpoint.py:99 "<target key> loaded from <checkpoint key> of shape (...)" record per parameter follows, covering bert.encoder.blocks.0-11, bert.encoder.tag_blocks.0-3, bert.extra_embeddings, bert.pooler, bert.tag_logit, and cls.predictions ...]
2022-03-17 13:40:22,415.415 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:40:22,415.415 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:40:22,418.418 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:40:22,582.582 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:40:22,906.906 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:40:22,931.931 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:40:22,931.931 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0015000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:40:23,003.003 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
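The checkpoint.py:99 run above is the name-alignment step of checkpoint loading: each key of the DataParallel-wrapped target model (the "module."-prefixed names) is matched against a checkpoint key of the same name and shape, and the checkpoint.py:104 summary confirms all 288 parameters matched. A minimal sketch of the idea, assuming a plain "module." prefix rule; align_and_load is illustrative, not the project's actual checkpoint.py code:

    import logging
    import torch

    def align_and_load(model: torch.nn.Module, ckpt_state: dict) -> None:
        """Hypothetical sketch of name-based state-dict alignment."""
        model_state = model.state_dict()
        matched = {}
        for key, param in model_state.items():
            # The target model is wrapped (DataParallel); the checkpoint is not.
            src = key[len('module.'):] if key.startswith('module.') else key
            if src in ckpt_state and ckpt_state[src].shape == param.shape:
                matched[key] = ckpt_state[src]
                logging.info('%s loaded from %s of shape %s',
                             key, src, tuple(param.shape))
        logging.info('target model param = %d; name matched = %d; loaded = %d',
                     len(model_state), len(matched), len(matched))
        # strict=False: parameters absent from the checkpoint keep their
        # freshly initialized values instead of raising.
        model.load_state_dict(matched, strict=False)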
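The DatasetPlusTransform record above also pins down the eval-time image preprocessing: bicubic resize to 384, a 384x384 center crop, and normalization to the [-1, 1] range. For reference, the image part can be rebuilt with a recent torchvision as below (a sketch; ImageTransform2Dict, LoadLabel, and TransCaptionTensorizer are project-specific wrappers and are not reproduced):

    from torchvision import transforms

    # Same pipeline the log prints; InterpolationMode.BICUBIC is the modern
    # spelling of the PIL.Image.BICUBIC constant shown in the repr.
    image_transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.Resize(384, interpolation=transforms.InterpolationMode.BICUBIC),
        transforms.CenterCrop((384, 384)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])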
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
[traceback truncated in the original log; same JSON-serialization TypeError as the block below]
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:42:33,536.536 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:42:34,594.594 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:42:36,145.145 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:42:36,795.795 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:42:36,943.943 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:42:37,732.732 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:42:37,732.732 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:42:38,870.870 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:42:39,407.407 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0020000.pt
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:42:47,788.788 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
[... per-parameter "loaded from ... of shape (...)" records continue in the same pattern for module.bert.decoder.layer.0-3, module.bert.embeddings, and module.bert.encoder.blocks.0-1 ...]
2022-03-17 13:42:47,795.795 2829:checkpoint.py:99 align_and_update_state_dicts():
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,795.795 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,796.796 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,797.797 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,798.798 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,799.799 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:42:47,800.800 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:42:47,801.801 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,802.802 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,803.803 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:42:47,804.804 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,805.805 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,806.806 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:42:47,807.807 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:42:47,808.808 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:42:47,809.809 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:42:47,809.809 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:42:47,813.813 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:42:47,978.978 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:42:48,303.303 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:42:48,330.330 2829:uni_pipeline.py:509 get_data_loader(): sampler = 
2022-03-17 13:42:48,330.330 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0020000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:42:48,402.402 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:44:56,764.764 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:44:57,834.834 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:44:59,390.390 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:45:00,047.047 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:45:00,195.195 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:45:00,990.990 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:45:00,990.990 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:45:02,135.135 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:45:02,673.673 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0025000.pt
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:45:11,003.003 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:45:11,004.004 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:45:11,005.005 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,)
[... similar checkpoint.py:99 records, 2022-03-17 13:45:11,005-025, one per parameter in key order: module.bert.decoder.layer.1-3 (attention.self.{query,key,value}, attention.output.{dense,LayerNorm}, intermediate.dense, output.{dense,LayerNorm}), module.bert.embeddings.* and module.bert.extra_embeddings.* (LayerNorm, position/token_type/word embeddings), module.bert.encoder.blocks.0-11 and module.bert.encoder.tag_blocks.0-3 (attn.{proj,qkv}, mlp.{fc1,fc2}, norm1, norm2), module.bert.pooler.dense, module.bert.tag_logit.predictions.*, and module.cls.predictions.*; each key is loaded from the identically named checkpoint key minus the "module." prefix, with shape (768,), (768, 768), (3072,), (3072, 768), (768, 3072), (2304,), (2304, 768), (512, 768), (2, 768), (30522,), or (30522, 768) ...]
2022-03-17 13:45:11,025.025 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:45:11,025.025 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:45:11,025.025 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:45:11,028.028 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:45:11,196.196 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:45:11,526.526 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:45:11,552.552 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:45:11,553.553 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0025000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
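The align_and_update_state_dicts() records above illustrate a simple alignment scheme: every model key carries a "module." prefix (added when the model is wrapped in DataParallel/DistributedDataParallel) while the checkpoint keys do not, so parameters are matched by stripping the prefix and comparing shapes; the 288/288 summary confirms a full match. A minimal sketch of that idea, assuming plain torch state dicts; the function below is illustrative, not the repository's actual checkpoint.py:

import logging
import torch

def align_and_update_state_dicts_sketch(model_state, checkpoint_state):
    # Match each model key to the checkpoint key that differs only by a
    # leading "module." prefix; copy the tensor when the shapes agree,
    # logging one record per parameter like the lines above.
    matched = 0
    for key in sorted(model_state):
        ckpt_key = key[len("module."):] if key.startswith("module.") else key
        if ckpt_key in checkpoint_state and checkpoint_state[ckpt_key].shape == model_state[key].shape:
            model_state[key] = checkpoint_state[ckpt_key]
            matched += 1
            logging.info("%s loaded from %s of shape %s",
                         key, ckpt_key, tuple(checkpoint_state[ckpt_key].shape))
    logging.info("target model param = %d; name matched = %d; loaded = %d",
                 len(model_state), matched, matched)
    return model_state

# Toy usage: one matching key, as in the pooler records above.
model_sd = {"module.bert.pooler.dense.bias": torch.zeros(768)}
ckpt_sd = {"bert.pooler.dense.bias": torch.ones(768)}
align_and_update_state_dicts_sketch(model_sd, ckpt_sd)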
2022-03-17 13:45:11,625.625 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
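The predict_iter() record above spells out the evaluation-time image pipeline: decode with OpenCV (backend=cv), convert to PIL, bicubic-resize the shorter side to 384, center-crop to 384x384, and normalize each channel to [-1, 1]. The same composition in stand-alone torchvision (a sketch; the log's interpolation=PIL.Image.BICUBIC is the older torchvision spelling of InterpolationMode.BICUBIC):

from torchvision import transforms

# 384px bicubic resize, 384x384 center crop, [-1, 1] normalization,
# matching the Compose(...) printed in the predict_iter() record above.
eval_transform = transforms.Compose([
    transforms.ToPILImage(),  # the upstream loader hands over a numpy array
    transforms.Resize(384, interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])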
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 13:47:26,121.121 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:47:27,192.192 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:47:28,713.713 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:47:29,361.361 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:47:29,509.509 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:47:30,301.301 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:47:30,301.301 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 13:47:31,440.440 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:47:31,980.980 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0030000.pt
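The helpers.py:270 records here and at 13:47:27-28 are timm's load_pretrained() fetching the ImageNet-pretrained ViT-B/16 (384px) checkpoint from the release URL shown. In a stand-alone script, a timm 0.x build contemporary with this log downloads the same jx_vit_base_p16_384-83fb41ba.pth via the model name below (a version-dependent sketch; newer timm releases default to different AugReg weights for this name):

import timm
import torch

# Downloads and caches the ViT-B/16 384px weights reported above.
model = timm.create_model('vit_base_patch16_384', pretrained=True)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 384, 384))
print(logits.shape)  # torch.Size([1, 1000]); cf. head.weight (1000, 768) below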
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:47:40,534.534 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,535.535 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:47:40,536.536 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,537.537 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,538.538 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:47:40,539.539 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:47:40,540.540 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:47:40,541.541 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,542.542 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,543.543 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,544.544 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,545.545 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,546.546 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,547.547 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:47:40,548.548 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:47:40,549.549 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:47:40,550.550 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:47:40,551.551 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:47:40,552.552 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:47:40,553.553 2829:checkpoint.py:99 
2022-03-17 13:47:40,554.554 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:47:40,554.554 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:47:40,557.557 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:47:40,725.725 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:47:41,050.050 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = <...>; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:47:41,076.076 2829:uni_pipeline.py:509 get_data_loader(): sampler = <...>
2022-03-17 13:47:41,076.076 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0030000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:47:41,148.148 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=<...>, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=<...>, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
[... formatting-exception traceback, identical to the one in the block below ...]
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
"/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. --- Logging error --- Traceback (most recent call last): File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit msg = self.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format return fmt.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format record.message = record.getMessage() File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage msg = str(self.msg) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__ return str(self.to_json_string()) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode chunks = list(chunks) File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode o = _default(o) File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type BertTokenizer is not JSON serializable Call stack: File "src/qd/pipeline.py", line 1368, in locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 
2022-03-17 13:49:53,005.005 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:49:54,070.070 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:49:55,610.610 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:49:56,259.259 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:49:56,407.407 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:49:57,197.197 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:49:57,197.197 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:49:58,335.335 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:49:58,876.876 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0035000.pt
2022-03-17 13:50:07,173.173 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 13:50:07,174.174 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,175.175 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,176.176 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of 
shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,177.177 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:50:07,178.178 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,179.179 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,180.180 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape 
(3072, 768) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:50:07,181.181 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:50:07,182.182 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,183.183 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from 
bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,184.184 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,185.185 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,186.186 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 
3072) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,187.187 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded 
from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,188.188 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:50:07,189.189 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:50:07,190.190 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,191.191 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:50:07,192.192 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:50:07,193.193 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:50:07,194.194 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:50:07,195.195 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:50:07,196.196 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:50:07,196.196 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:50:07,199.199 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:50:07,360.360 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
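As with the model_iter_0030000 checkpoint earlier, these align_and_update_state_dicts() records show all 288 target parameters matching a checkpoint key that differs at most by the DataParallel-style "module." prefix, after which load_model_state_ignore_mismatch() reports the keys that do not line up one-to-one between the two dicts (here only module.cls.predictions.decoder.weight). A rough sketch of that kind of prefix-tolerant matching, under the assumption that this is essentially what checkpoint.py does; align_state_dict is a hypothetical name, not the project's function:

```python
import logging
from collections import OrderedDict

def align_state_dict(model_keys, loaded_state):
    # Match each model parameter name against the checkpoint, accepting a
    # missing or extra DataParallel-style "module." prefix (hypothetical
    # sketch of what the align_and_update_state_dicts() records suggest).
    aligned = OrderedDict()
    for key in model_keys:
        candidates = [key]
        if key.startswith("module."):
            candidates.append(key[len("module."):])
        else:
            candidates.append("module." + key)
        for cand in candidates:
            if cand in loaded_state:
                # Mirrors the "X loaded from Y of shape S" records above.
                logging.info("%s loaded from %s of shape %s",
                             key, cand, tuple(loaded_state[cand].shape))
                aligned[key] = loaded_state[cand]
                break
    logging.info("target model param = %d; name matched = %d",
                 len(model_keys), len(aligned))
    return aligned
```

Called with model.state_dict().keys() and a loaded torch checkpoint dict, this would produce the "module.X loaded from X" pattern seen throughout the listing above.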
split=test, version=vinvl) TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True) )) uni_pipeline.py:908: 0%| | 0/10 [00:00 locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. --- Logging error --- Traceback (most recent call last): File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit msg = self.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format return fmt.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format record.message = record.getMessage() File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage msg = str(self.msg) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__ return str(self.to_json_string()) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode chunks = list(chunks) File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode o = _default(o) File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type BertTokenizer is not JSON serializable Call stack: File "src/qd/pipeline.py", line 1368, in locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) 
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 13:52:20,314.314 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:52:21,381.381 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:52:22,920.920 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:52:23,568.568 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:52:23,716.716 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:52:24,506.506 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:52:24,506.506 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 13:52:25,639.639 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:52:26,170.170 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0040000.pt 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:52:34,211.211 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,212.212 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,212.212 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,213.213 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,214.214 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,215.215 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,216.216 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,217.217 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,218.218 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:52:34,219.219 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:52:34,220.220 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:52:34,221.221 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:52:34,222.222 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,223.223 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,224.224 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,225.225 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,226.226 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,227.227 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:52:34,228.228 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,)
[... 2022-03-17 13:52:34 2829:checkpoint.py:99 align_and_update_state_dicts() logs one "module.<key> loaded from <key> of shape <shape>" confirmation per remaining parameter: bert.encoder.tag_blocks.0-3 (attn.proj, attn.qkv, mlp.fc1/fc2, norm1/norm2), bert.extra_embeddings (LayerNorm, position/token_type/word embeddings), bert.pooler.dense, bert.tag_logit.predictions and cls.predictions, with shapes (768,), (768, 768), (2304,), (2304, 768), (3072,), (3072, 768), (768, 3072), (512, 768), (2, 768) and (30522, 768) as appropriate ...]
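The per-parameter lines above all follow one rule: a key in the (Data)Parallel-wrapped target model is matched to the checkpoint key without the "module." prefix, and the tensor is copied only when the shapes agree. Below is a minimal sketch of that matching; the function and variable names are hypothetical, and the repo's actual align_and_update_state_dicts() in checkpoint.py is more general.

```python
import logging

def align_state_dicts(model_state, ckpt_state):
    # For each target key, try the exact checkpoint key first, then the key
    # with the first "module." prefix removed; copy only on an exact shape
    # match, mirroring the checkpoint.py:99 lines above.
    matched = 0
    for tgt_key, tgt_param in model_state.items():
        for src_key in (tgt_key, tgt_key.replace('module.', '', 1)):
            src = ckpt_state.get(src_key)
            if src is not None and tuple(src.shape) == tuple(tgt_param.shape):
                model_state[tgt_key] = src
                matched += 1
                logging.info('%s loaded from %s of shape %s',
                             tgt_key, src_key, tuple(src.shape))
                break
    logging.info('target model param = %d; name matched = %d; loaded = %d',
                 len(model_state), matched, matched)
    return model_state
```

Trying the exact key first covers entries such as image_encoder.module.cls_token, which the log shows loading from the identically named checkpoint key, while the stripped form covers the module.bert.* entries loading from bert.*.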
2022-03-17 13:52:34,233.233 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:52:34,233.233 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:52:34,236.236 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:52:34,404.404 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:52:34,732.732 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:52:34,757.757 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:52:34,758.758 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0040000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:52:34,830.830 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908:   0%|          | 0/10 [00:00<?, ?it/s]
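The predict_iter() dump above pins down the evaluation-time preprocessing: bicubic resize to 384, center crop to 384x384, and normalization with mean and std of 0.5. A standalone torchvision sketch of the image-side transform follows; ImageTransform2Dict, LoadLabel and TransCaptionTensorizer are repo-specific wrappers and are deliberately omitted.

```python
from PIL import Image
from torchvision import transforms

# Image transform as logged: cv-backend arrays in, normalized 384x384
# tensors in [-1, 1] out. Newer torchvision versions prefer
# transforms.InterpolationMode.BICUBIC over the PIL constant used here.
eval_image_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(384, interpolation=Image.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])
```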
"/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. --- Logging error --- Traceback (most recent call last): File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit msg = self.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format return fmt.format(record) File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format record.message = record.getMessage() File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage msg = str(self.msg) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__ return str(self.to_json_string()) File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps **kw).encode(obj) File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode chunks = list(chunks) File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode yield from _iterencode_dict(o, _current_indent_level) File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict yield from chunks File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode o = _default(o) File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default raise TypeError(f'Object of type {o.__class__.__name__} ' TypeError: Object of type BertTokenizer is not JSON serializable Call stack: File "src/qd/pipeline.py", line 1368, in locals()[function_name](**kwargs) File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi pip.monitor_train() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train need_wait_models = self.pred_eval_intermediate_models() File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models pred = self.ensure_predict(model_file=model_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict self.predict(model_file, predict_result_file) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 
2022-03-17 13:54:47,895.895 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 13:54:48,963.963 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:54:50,469.469 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:54:51,106.106 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:54:51,254.254 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 13:54:52,024.024 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 13:54:52,024.024 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 13:54:53,137.137 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 13:54:53,734.734 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0045000.pt
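All three load_pretrained() lines fetch the same timm checkpoint (jx_vit_base_p16_384-83fb41ba.pth), i.e. ViT-B/16 pretrained at 384x384 resolution. As a standalone reference, and assuming any reasonably recent timm release, the equivalent call is:

```python
import timm

# Downloads and loads the same jx_vit_base_p16_384 weights referenced
# in the log lines above.
vit = timm.create_model('vit_base_patch16_384', pretrained=True)
```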
[... 2022-03-17 13:55:01 2829:checkpoint.py:99 align_and_update_state_dicts(): per-parameter "loaded from ... of shape ..." confirmations for model_iter_0045000.pt (image_encoder.module.cls_token/head/patch_embed/pos_embed, module.bert.caption_pooler, decoder.layer.0-3, embeddings, encoder.blocks.0-11, encoder.tag_blocks.0-3, extra_embeddings, pooler, tag_logit), identical in keys and shapes to the model_iter_0040000.pt load above ...]
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99
align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:55:01,842.842 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:55:01,843.843 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:55:01,846.846 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:55:02,008.008 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
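The block above ends with the useful summary: all 288 target parameters were name-matched and loaded, and only module.cls.predictions.decoder.weight is reported as a unique, not-initialized key. Every message follows the pattern "module.X loaded from X", i.e. a DataParallel/DDP-wrapped model (keys prefixed with "module.") being filled from an unwrapped checkpoint. Below is a minimal sketch of that kind of suffix-matching loader; it illustrates the idea only and is not the repository's actual checkpoint.py implementation.

import logging
import torch

def align_by_suffix(model_sd, ckpt_sd):
    """Fill a (possibly 'module.'-prefixed) model state dict from a plain
    checkpoint by matching each model key to the longest checkpoint key
    that is a suffix of it, skipping shape mismatches."""
    aligned = {}
    for key, param in model_sd.items():
        candidates = [k for k in ckpt_sd if key == k or key.endswith('.' + k)]
        if not candidates:
            continue  # no match: parameter stays at its fresh initialization
        src = max(candidates, key=len)
        if ckpt_sd[src].shape != param.shape:
            continue  # e.g. a resized head; keep the model's own tensor
        aligned[key] = ckpt_sd[src]
        logging.info('%s loaded from %s of shape %s',
                     key, src, tuple(ckpt_sd[src].shape))
    return aligned

# usage sketch:
# model.load_state_dict(align_by_suffix(model.state_dict(),
#                                       torch.load('model.pt')), strict=False)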
2022-03-17 13:55:02,331.331 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:55:02,357.357 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:55:02,358.358 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0045000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:55:02,431.431 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config)  # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
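Both logging errors above share one root cause: logging.info(config) invokes the config object's __repr__, which calls to_json_string(), and json.dumps raises on a BertTokenizer instance that was stored on the config. The job keeps running (logging errors are non-fatal), but every attempt to log the config dumps this stack. A hypothetical one-line hardening of to_json_string, assuming one is free to patch modeling_utils.py, is to give json.dumps a fallback encoder:

import json

def to_json_string(self):
    # default=repr turns anything json cannot encode (here, the BertTokenizer
    # stored on the config) into its repr string instead of raising TypeError.
    return json.dumps(self.to_dict(), indent=2, sort_keys=True,
                      default=repr) + "\n"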
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 13:57:15,241.241 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:57:16,308.308 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:57:17,851.851 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:57:18,503.503 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:57:18,651.651 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:57:19,440.440 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:57:19,440.440 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 13:57:20,571.571 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:57:21,104.104 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0050000.pt 2022-03-17 13:57:29,279.279 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:57:29,279.279 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,280.280 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,281.281 
2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,281.281 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from 
bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:57:29,282.282 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,283.283 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 
13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:57:29,284.284 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,285.285 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from 
bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,286.286 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,287.287 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from 
bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,288.288 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:57:29,289.289 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:57:29,290.290 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.bias loaded from bert.encoder.blocks.4.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.qkv.weight loaded from bert.encoder.blocks.4.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.bias loaded from bert.encoder.blocks.4.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc1.weight loaded from bert.encoder.blocks.4.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.bias loaded from bert.encoder.blocks.4.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.mlp.fc2.weight loaded from 
bert.encoder.blocks.4.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.bias loaded from bert.encoder.blocks.4.norm1.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm1.weight loaded from bert.encoder.blocks.4.norm1.weight of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.bias loaded from bert.encoder.blocks.4.norm2.bias of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.norm2.weight loaded from bert.encoder.blocks.4.norm2.weight of shape (768,) 2022-03-17 13:57:29,291.291 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.bias loaded from bert.encoder.blocks.5.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.proj.weight loaded from bert.encoder.blocks.5.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.bias loaded from bert.encoder.blocks.5.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.attn.qkv.weight loaded from bert.encoder.blocks.5.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.bias loaded from bert.encoder.blocks.5.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc1.weight loaded from bert.encoder.blocks.5.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.bias loaded from bert.encoder.blocks.5.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.mlp.fc2.weight loaded from bert.encoder.blocks.5.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 13:57:29,292.292 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,293.293 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,294.294 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,295.295 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,296.296 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 13:57:29,297.297 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 13:57:29,298.298 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 13:57:29,299.299 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 13:57:29,300.300 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:57:29,301.301 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:57:29,301.301 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:57:29,304.304 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:57:29,464.464 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 13:57:29,788.788 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:57:29,815.815 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 13:57:29,815.815 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0050000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:57:29,887.887 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 13:59:43,919.919 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 13:59:44,988.988 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:59:46,478.478 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:59:47,096.096 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:59:47,244.244 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 13:59:48,014.014 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 13:59:48,014.014 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 2022-03-17 13:59:49,109.109 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 13:59:49,673.673 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0055000.pt 2022-03-17 13:59:55,915.915 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768) 2022-03-17 13:59:55,915.915 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,) 2022-03-17 13:59:55,915.915 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from 
bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,916.916 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.bias loaded from bert.decoder.layer.0.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.value.weight loaded from bert.decoder.layer.0.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.bias loaded from bert.decoder.layer.0.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.intermediate.dense.weight loaded from bert.decoder.layer.0.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.bias loaded from bert.decoder.layer.0.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.LayerNorm.weight loaded from bert.decoder.layer.0.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.bias loaded from bert.decoder.layer.0.output.dense.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.output.dense.weight loaded from bert.decoder.layer.0.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.bias loaded from bert.decoder.layer.1.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.LayerNorm.weight loaded from bert.decoder.layer.1.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.bias loaded from bert.decoder.layer.1.attention.output.dense.bias of shape (768,) 2022-03-17 13:59:55,917.917 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.output.dense.weight loaded from bert.decoder.layer.1.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.bias loaded from bert.decoder.layer.1.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.key.weight loaded from bert.decoder.layer.1.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.bias loaded from bert.decoder.layer.1.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.query.weight loaded from bert.decoder.layer.1.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.bias loaded from bert.decoder.layer.1.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.attention.self.value.weight loaded from bert.decoder.layer.1.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.bias loaded from bert.decoder.layer.1.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.intermediate.dense.weight loaded from bert.decoder.layer.1.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.bias loaded from bert.decoder.layer.1.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.LayerNorm.weight loaded from bert.decoder.layer.1.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.bias loaded from bert.decoder.layer.1.output.dense.bias of shape (768,) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.1.output.dense.weight loaded from bert.decoder.layer.1.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,918.918 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.bias loaded from bert.decoder.layer.2.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.LayerNorm.weight loaded from bert.decoder.layer.2.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.bias loaded from bert.decoder.layer.2.attention.output.dense.bias of 
shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.output.dense.weight loaded from bert.decoder.layer.2.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.bias loaded from bert.decoder.layer.2.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.key.weight loaded from bert.decoder.layer.2.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.bias loaded from bert.decoder.layer.2.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.query.weight loaded from bert.decoder.layer.2.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.bias loaded from bert.decoder.layer.2.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.attention.self.value.weight loaded from bert.decoder.layer.2.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.bias loaded from bert.decoder.layer.2.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.intermediate.dense.weight loaded from bert.decoder.layer.2.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.bias loaded from bert.decoder.layer.2.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.LayerNorm.weight loaded from bert.decoder.layer.2.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,919.919 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.bias loaded from bert.decoder.layer.2.output.dense.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.2.output.dense.weight loaded from bert.decoder.layer.2.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.bias loaded from bert.decoder.layer.3.attention.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.LayerNorm.weight loaded from bert.decoder.layer.3.attention.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.output.dense.bias loaded from bert.decoder.layer.3.attention.output.dense.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.decoder.layer.3.attention.output.dense.weight loaded from bert.decoder.layer.3.attention.output.dense.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.bias loaded from bert.decoder.layer.3.attention.self.key.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.key.weight loaded from bert.decoder.layer.3.attention.self.key.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.bias loaded from bert.decoder.layer.3.attention.self.query.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.query.weight loaded from bert.decoder.layer.3.attention.self.query.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.bias loaded from bert.decoder.layer.3.attention.self.value.bias of shape (768,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.attention.self.value.weight loaded from bert.decoder.layer.3.attention.self.value.weight of shape (768, 768) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.bias loaded from bert.decoder.layer.3.intermediate.dense.bias of shape (3072,) 2022-03-17 13:59:55,920.920 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.intermediate.dense.weight loaded from bert.decoder.layer.3.intermediate.dense.weight of shape (3072, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.bias loaded from bert.decoder.layer.3.output.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.LayerNorm.weight loaded from bert.decoder.layer.3.output.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.bias loaded from bert.decoder.layer.3.output.dense.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.3.output.dense.weight loaded from bert.decoder.layer.3.output.dense.weight of shape (768, 3072) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.bias loaded from bert.embeddings.LayerNorm.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.LayerNorm.weight loaded from bert.embeddings.LayerNorm.weight of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.position_embeddings.weight loaded from bert.embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.embeddings.token_type_embeddings.weight loaded from bert.embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.embeddings.word_embeddings.weight loaded from bert.embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.bias loaded from bert.encoder.blocks.0.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.proj.weight loaded from bert.encoder.blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.bias loaded from bert.encoder.blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,921.921 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.attn.qkv.weight loaded from bert.encoder.blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.bias loaded from bert.encoder.blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc1.weight loaded from bert.encoder.blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.bias loaded from bert.encoder.blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.mlp.fc2.weight loaded from bert.encoder.blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.bias loaded from bert.encoder.blocks.0.norm1.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm1.weight loaded from bert.encoder.blocks.0.norm1.weight of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.bias loaded from bert.encoder.blocks.0.norm2.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.0.norm2.weight loaded from bert.encoder.blocks.0.norm2.weight of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.bias loaded from bert.encoder.blocks.1.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.proj.weight loaded from bert.encoder.blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.bias loaded from bert.encoder.blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.attn.qkv.weight loaded from bert.encoder.blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,922.922 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.bias loaded from bert.encoder.blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc1.weight loaded from bert.encoder.blocks.1.mlp.fc1.weight of shape 
(3072, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.bias loaded from bert.encoder.blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.mlp.fc2.weight loaded from bert.encoder.blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.bias loaded from bert.encoder.blocks.1.norm1.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm1.weight loaded from bert.encoder.blocks.1.norm1.weight of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.bias loaded from bert.encoder.blocks.1.norm2.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.1.norm2.weight loaded from bert.encoder.blocks.1.norm2.weight of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.bias loaded from bert.encoder.blocks.10.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.proj.weight loaded from bert.encoder.blocks.10.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.bias loaded from bert.encoder.blocks.10.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.attn.qkv.weight loaded from bert.encoder.blocks.10.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.bias loaded from bert.encoder.blocks.10.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc1.weight loaded from bert.encoder.blocks.10.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,923.923 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.bias loaded from bert.encoder.blocks.10.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.mlp.fc2.weight loaded from bert.encoder.blocks.10.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.bias loaded from bert.encoder.blocks.10.norm1.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm1.weight loaded from bert.encoder.blocks.10.norm1.weight of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.bias loaded from bert.encoder.blocks.10.norm2.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.10.norm2.weight loaded from bert.encoder.blocks.10.norm2.weight of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.11.attn.proj.bias loaded from bert.encoder.blocks.11.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.proj.weight loaded from bert.encoder.blocks.11.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.bias loaded from bert.encoder.blocks.11.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.attn.qkv.weight loaded from bert.encoder.blocks.11.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.bias loaded from bert.encoder.blocks.11.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc1.weight loaded from bert.encoder.blocks.11.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.bias loaded from bert.encoder.blocks.11.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,924.924 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.mlp.fc2.weight loaded from bert.encoder.blocks.11.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.bias loaded from bert.encoder.blocks.11.norm1.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm1.weight loaded from bert.encoder.blocks.11.norm1.weight of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.bias loaded from bert.encoder.blocks.11.norm2.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.11.norm2.weight loaded from bert.encoder.blocks.11.norm2.weight of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.bias loaded from bert.encoder.blocks.2.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.proj.weight loaded from bert.encoder.blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.bias loaded from bert.encoder.blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.attn.qkv.weight loaded from bert.encoder.blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.bias loaded from bert.encoder.blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc1.weight loaded from bert.encoder.blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.bias loaded from 
bert.encoder.blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.mlp.fc2.weight loaded from bert.encoder.blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,925.925 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.bias loaded from bert.encoder.blocks.2.norm1.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm1.weight loaded from bert.encoder.blocks.2.norm1.weight of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.bias loaded from bert.encoder.blocks.2.norm2.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.2.norm2.weight loaded from bert.encoder.blocks.2.norm2.weight of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.bias loaded from bert.encoder.blocks.3.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.proj.weight loaded from bert.encoder.blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.bias loaded from bert.encoder.blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.attn.qkv.weight loaded from bert.encoder.blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.bias loaded from bert.encoder.blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc1.weight loaded from bert.encoder.blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.bias loaded from bert.encoder.blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.mlp.fc2.weight loaded from bert.encoder.blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.bias loaded from bert.encoder.blocks.3.norm1.bias of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm1.weight loaded from bert.encoder.blocks.3.norm1.weight of shape (768,) 2022-03-17 13:59:55,926.926 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.bias loaded from bert.encoder.blocks.3.norm2.bias of shape (768,) 2022-03-17 13:59:55,927.927 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.3.norm2.weight loaded from bert.encoder.blocks.3.norm2.weight of shape (768,) 2022-03-17 13:59:55,927.927 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.4.attn.proj.bias loaded from bert.encoder.blocks.4.attn.proj.bias of shape (768,) 2022-03-17 13:59:55,927.927 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.blocks.4.attn.proj.weight loaded from bert.encoder.blocks.4.attn.proj.weight of shape (768, 768)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,)
2022-03-17 13:59:55,935.935 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:59:55,936.936 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.bias loaded from cls.predictions.transform.LayerNorm.bias of shape (768,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.LayerNorm.weight loaded from cls.predictions.transform.LayerNorm.weight of shape (768,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.bias loaded from cls.predictions.transform.dense.bias of shape (768,)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.transform.dense.weight loaded from cls.predictions.transform.dense.weight of shape (768, 768)
2022-03-17 13:59:55,937.937 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 13:59:55,937.937 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 13:59:55,940.940 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 13:59:56,100.100 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
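The matched/loaded summary is the useful signal in this dump: every checkpoint key pairs with a model key once the DataParallel-style "module." prefix is accounted for, and a parameter only counts as loaded when the shapes agree (the fused qkv projections are (2304, 768) because 2304 = 3 x 768 for query, key, and value). Below is a minimal sketch of that kind of name-and-shape alignment, assuming PyTorch; align_state_dict is a hypothetical helper, not the repo's actual align_and_update_state_dicts implementation.

import torch

def align_state_dict(model_sd, ckpt_sd, prefix="module."):
    # Pair each model key with the checkpoint key obtained by stripping
    # the DataParallel-style prefix; accept only exact shape matches.
    aligned = {}
    for model_key, tensor in model_sd.items():
        ckpt_key = model_key[len(prefix):] if model_key.startswith(prefix) else model_key
        if ckpt_key in ckpt_sd and ckpt_sd[ckpt_key].shape == tensor.shape:
            aligned[model_key] = ckpt_sd[ckpt_key]
    return aligned

# Toy check with the fused-qkv shape from the records: 2304 = 3 * 768.
model_sd = {"module.bert.encoder.blocks.4.attn.qkv.weight": torch.empty(2304, 768)}
ckpt_sd = {"bert.encoder.blocks.4.attn.qkv.weight": torch.empty(2304, 768)}
assert len(align_state_dict(model_sd, ckpt_sd)) == 1  # name matched = 1; loaded = 1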
2022-03-17 13:59:56,435.435 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 13:59:56,463.463 2829:uni_pipeline.py:509 get_data_loader(): sampler = 
2022-03-17 13:59:56,464.464 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0055000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 13:59:56,540.540 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908: 0%| | 0/10 [00:00
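The image half of the Compose printed in the predict_iter() record above is plain torchvision; a standalone equivalent might look like the sketch below. The DatasetPlusTransform, LoadLabel, and TransCaptionTensorizer wrappers are project-specific and omitted, and the PIL.Image.BICUBIC constant matches the 2022-era torchvision in this environment (newer releases prefer transforms.InterpolationMode.BICUBIC).

from PIL import Image
from torchvision import transforms

# Evaluation-time transform as printed in the record: numpy array from the
# cv backend -> PIL -> bicubic resize to 384 -> 384x384 center crop ->
# tensor in [0, 1] -> normalize to [-1, 1].
eval_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize(384, interpolation=Image.BICUBIC),
    transforms.CenterCrop((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
])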
File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict model = self.get_model(is_train=False) File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model model = self.get_raw_model(is_train) File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__ self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__ self.encoder = TIMMVitSplitEncoder(config) File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__ logging.info(config) Unable to print the message and arguments - possible formatting error. Use the traceback above to help find the error. 2022-03-17 14:02:11,526.526 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True 2022-03-17 14:02:12,599.599 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 14:02:14,102.102 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py 2022-03-17 14:02:14,728.728 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 14:02:14,876.876 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054 2022-03-17 14:02:15,646.646 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later! 2022-03-17 14:02:15,646.646 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode. 
2022-03-17 14:02:16,736.736 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:02:17,294.294 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0060000.pt
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.cls_token loaded from image_encoder.module.cls_token of shape (1, 1, 768)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.bias loaded from image_encoder.module.head.bias of shape (1000,)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.head.weight loaded from image_encoder.module.head.weight of shape (1000, 768)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.bias loaded from image_encoder.module.patch_embed.proj.bias of shape (768,)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.patch_embed.proj.weight loaded from image_encoder.module.patch_embed.proj.weight of shape (768, 3, 16, 16)
2022-03-17 14:02:23,632.632 2829:checkpoint.py:99 align_and_update_state_dicts(): image_encoder.module.pos_embed loaded from image_encoder.module.pos_embed of shape (1, 577, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.bias loaded from bert.caption_pooler.dense.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.caption_pooler.dense.weight loaded from bert.caption_pooler.dense.weight of shape (768, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.bias loaded from bert.decoder.layer.0.attention.output.LayerNorm.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.LayerNorm.weight loaded from bert.decoder.layer.0.attention.output.LayerNorm.weight of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.bias loaded from bert.decoder.layer.0.attention.output.dense.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.output.dense.weight loaded from bert.decoder.layer.0.attention.output.dense.weight of shape (768, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.bias loaded from bert.decoder.layer.0.attention.self.key.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.key.weight loaded from bert.decoder.layer.0.attention.self.key.weight of shape (768, 768)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.bias loaded from bert.decoder.layer.0.attention.self.query.bias of shape (768,)
2022-03-17 14:02:23,633.633 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.decoder.layer.0.attention.self.query.weight loaded from bert.decoder.layer.0.attention.self.query.weight of shape (768, 768)
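The pos_embed shape (1, 577, 768) a few records above is worth a quick sanity check: at 384x384 input with 16x16 patches, ViT-B/16-384 has (384 / 16)^2 = 576 patch tokens plus one class token, i.e. 577 positions. The same jx_vit_base_p16_384 weights the log downloads resolve through timm, so (assuming timm is installed) the shape can be cross-checked directly:

import timm  # assumes timm is available in the environment

num_patches = (384 // 16) ** 2      # 576 patch tokens for a 384x384 input
assert num_patches + 1 == 577       # plus the [CLS] token -> 577 positions

vit = timm.create_model("vit_base_patch16_384", pretrained=True)
print(vit.pos_embed.shape)          # torch.Size([1, 577, 768])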
2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts():
module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 14:02:23,645.645 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 
14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 14:02:23,646.646 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,647.647 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from 
bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,648.648 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape 
(768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,649.649 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,650.650 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 14:02:23,651.651 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.bias loaded from bert.tag_logit.predictions.transform.LayerNorm.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.LayerNorm.weight loaded from bert.tag_logit.predictions.transform.LayerNorm.weight of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.bias loaded from bert.tag_logit.predictions.transform.dense.bias of shape (768,) 2022-03-17 14:02:23,652.652 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.transform.dense.weight loaded from bert.tag_logit.predictions.transform.dense.weight of shape (768, 768) 2022-03-17 14:02:23,653.653 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.bias loaded from cls.predictions.bias of shape (30522,) 2022-03-17 14:02:23,653.653 2829:checkpoint.py:99 align_and_update_state_dicts(): module.cls.predictions.decoder.weight loaded from cls.predictions.decoder.weight of shape (30522, 768) 2022-03-17 14:02:23,653.653 2829:checkpoint.py:99 
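(For orientation: the messages above come from a name-based state-dict alignment. Below is a minimal illustrative sketch of what such an alignment does, assuming only a DataParallel-style "module." prefix on the model keys; this is a hypothetical helper, not the repository's actual checkpoint.py.)

# Sketch of name-based state-dict alignment (hypothetical; not the repo's
# checkpoint.py). Assumes model keys carry a "module." prefix that the
# checkpoint keys lack, as the log above suggests.
import logging
import torch

def align_and_load(model: torch.nn.Module, ckpt_state: dict) -> None:
    model_state = model.state_dict()
    matched = {}
    for name, param in model_state.items():
        ckpt_name = name[len("module."):] if name.startswith("module.") else name
        if ckpt_name in ckpt_state and ckpt_state[ckpt_name].shape == param.shape:
            matched[name] = ckpt_state[ckpt_name]
            logging.info("%s loaded from %s of shape %s",
                         name, ckpt_name, tuple(param.shape))
    logging.info("target model param = %d; name matched = %d; loaded = %d",
                 len(model_state), len(matched), len(matched))
    # Unmatched parameters stay at their initialization.
    model.load_state_dict(matched, strict=False)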
2022-03-17 14:02:23,653.653 2829:checkpoint.py:104 align_and_update_state_dicts(): target model param = 288; name matched = 288; loaded = 288
2022-03-17 14:02:23,653.653 2829:checkpoint.py:107 align_and_update_state_dicts(): from loaded; ignore = []
2022-03-17 14:02:23,656.656 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 14:02:23,828.828 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 14:02:24,163.163 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 14:02:24,189.189 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 14:02:24,189.189 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0060000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 14:02:24,262.262 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
uni_pipeline.py:908:   0%|          | 0/10 [00:00<?, ?it/s]
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 1025, in emit
    msg = self.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 869, in format
    return fmt.format(record)
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 608, in format
    record.message = record.getMessage()
  File "/opt/conda/lib/python3.7/logging/__init__.py", line 367, in getMessage
    msg = str(self.msg)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 236, in __repr__
    return str(self.to_json_string())
  File "/tmp/code/src/qd/mask/layers/bert/modeling_utils.py", line 245, in to_json_string
    return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"
  File "/opt/conda/lib/python3.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 201, in encode
    chunks = list(chunks)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "/opt/conda/lib/python3.7/json/encoder.py", line 438, in _iterencode
    o = _default(o)
  File "/opt/conda/lib/python3.7/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BertTokenizer is not JSON serializable
Call stack:
  File "src/qd/pipeline.py", line 1368, in <module>
    locals()[function_name](**kwargs)
  File "src/qd/pipeline.py", line 637, in pipeline_train_eval_multi
    pip.monitor_train()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1224, in monitor_train
    need_wait_models = self.pred_eval_intermediate_models()
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 1286, in pred_eval_intermediate_models
    pred = self.ensure_predict(model_file=model_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 886, in ensure_predict
    self.predict(model_file, predict_result_file)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 989, in predict
    model = self.get_model(is_train=False)
  File "/tmp/code/src/qd/pipelines/uni_pipeline.py", line 291, in get_model
    model = self.get_raw_model(is_train)
  File "/tmp/code/src/qd/pipelines/tagger_caption_uni_pipeline_expanding.py", line 913, in get_raw_model
    model = TaggerEncDecSplitForImageCaptioning(config=config) # init from scratch
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 5437, in __init__
    self.bert = TaggerEncDecCLSEmbSplitBertImgModel(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 2656, in __init__
    self.encoder = TIMMVitSplitEncoder(config)
  File "/tmp/code/src/qd/mask/layers/bert/modeling_bert.py", line 528, in __init__
    logging.info(config)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
2022-03-17 14:04:42,578.578 2829:modeling_bert.py:529 __init__(): TIMM Split image encoder load from pre-trained: True
2022-03-17 14:04:43,646.646 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:04:45,196.196 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:04:45,849.849 2829:modeling_bert.py:2677 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 14:04:45,998.998 2829:modeling_bert.py:2688 __init__(): BertImgModel Image Dimension: 2054
2022-03-17 14:04:46,788.788 2829:tagger_caption_uni_pipeline_expanding.py:1130 get_image_encoder_model(): VIT image encoder loaded from pre-trained weight! Note that this might be replaced by pre-trained checkpoint later!
2022-03-17 14:04:46,789.789 2829:tagger_caption_uni_pipeline_expanding.py:1134 get_image_encoder_model(): Non-Patch Selection Mode.
2022-03-17 14:04:47,921.921 2829:helpers.py:270 load_pretrained(): Loading pretrained weights from url (https://github.com/rwightman/pytorch-image-models/releases/download/v0.1-vitjx/jx_vit_base_p16_384-83fb41ba.pth) helpers.py
2022-03-17 14:04:48,456.456 2829:checkpoint.py:240 load(): Loading checkpoint from output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0065000.pt
[... 2022-03-17 14:04:57,017 to 14:04:57,031: align_and_update_state_dicts() per-parameter messages for the model_iter_0065000.pt checkpoint are elided. image_encoder.module.cls_token (1, 1, 768), image_encoder.module.head.*, image_encoder.module.patch_embed.proj.*, image_encoder.module.pos_embed (1, 577, 768), module.bert.caption_pooler.dense.*, module.bert.decoder.layer.0-3.*, module.bert.embeddings.*, and module.bert.encoder.blocks.0-5.* were all loaded from the same-named checkpoint keys with matching shapes; the captured log is truncated mid-message at module.bert.encoder.blocks.5.mlp.fc2.weight ...]
3072) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.bias loaded from bert.encoder.blocks.5.norm1.bias of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm1.weight loaded from bert.encoder.blocks.5.norm1.weight of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.bias loaded from bert.encoder.blocks.5.norm2.bias of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.5.norm2.weight loaded from bert.encoder.blocks.5.norm2.weight of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.bias loaded from bert.encoder.blocks.6.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.proj.weight loaded from bert.encoder.blocks.6.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.bias loaded from bert.encoder.blocks.6.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.attn.qkv.weight loaded from bert.encoder.blocks.6.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,031.031 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.bias loaded from bert.encoder.blocks.6.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc1.weight loaded from bert.encoder.blocks.6.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.bias loaded from bert.encoder.blocks.6.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.mlp.fc2.weight loaded from bert.encoder.blocks.6.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.bias loaded from bert.encoder.blocks.6.norm1.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm1.weight loaded from bert.encoder.blocks.6.norm1.weight of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.bias loaded from bert.encoder.blocks.6.norm2.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.6.norm2.weight loaded from bert.encoder.blocks.6.norm2.weight of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.bias loaded from bert.encoder.blocks.7.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.proj.weight loaded from bert.encoder.blocks.7.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.bias loaded 
from bert.encoder.blocks.7.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.attn.qkv.weight loaded from bert.encoder.blocks.7.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,032.032 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.bias loaded from bert.encoder.blocks.7.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc1.weight loaded from bert.encoder.blocks.7.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.bias loaded from bert.encoder.blocks.7.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.mlp.fc2.weight loaded from bert.encoder.blocks.7.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.bias loaded from bert.encoder.blocks.7.norm1.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm1.weight loaded from bert.encoder.blocks.7.norm1.weight of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.bias loaded from bert.encoder.blocks.7.norm2.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.7.norm2.weight loaded from bert.encoder.blocks.7.norm2.weight of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.bias loaded from bert.encoder.blocks.8.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.proj.weight loaded from bert.encoder.blocks.8.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.bias loaded from bert.encoder.blocks.8.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.attn.qkv.weight loaded from bert.encoder.blocks.8.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.bias loaded from bert.encoder.blocks.8.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,033.033 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc1.weight loaded from bert.encoder.blocks.8.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.bias loaded from bert.encoder.blocks.8.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.mlp.fc2.weight loaded from bert.encoder.blocks.8.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.bias loaded from bert.encoder.blocks.8.norm1.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm1.weight loaded from bert.encoder.blocks.8.norm1.weight of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.bias loaded from bert.encoder.blocks.8.norm2.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.8.norm2.weight loaded from bert.encoder.blocks.8.norm2.weight of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.bias loaded from bert.encoder.blocks.9.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.proj.weight loaded from bert.encoder.blocks.9.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.bias loaded from bert.encoder.blocks.9.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.attn.qkv.weight loaded from bert.encoder.blocks.9.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.bias loaded from bert.encoder.blocks.9.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,034.034 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc1.weight loaded from bert.encoder.blocks.9.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.bias loaded from bert.encoder.blocks.9.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.mlp.fc2.weight loaded from bert.encoder.blocks.9.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.bias loaded from bert.encoder.blocks.9.norm1.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm1.weight loaded from bert.encoder.blocks.9.norm1.weight of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.bias loaded from bert.encoder.blocks.9.norm2.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.blocks.9.norm2.weight loaded from bert.encoder.blocks.9.norm2.weight of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.bias loaded from bert.encoder.tag_blocks.0.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.proj.weight loaded from bert.encoder.tag_blocks.0.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.bias loaded from bert.encoder.tag_blocks.0.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.attn.qkv.weight loaded from 
bert.encoder.tag_blocks.0.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.bias loaded from bert.encoder.tag_blocks.0.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc1.weight loaded from bert.encoder.tag_blocks.0.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,035.035 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.bias loaded from bert.encoder.tag_blocks.0.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.mlp.fc2.weight loaded from bert.encoder.tag_blocks.0.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.bias loaded from bert.encoder.tag_blocks.0.norm1.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm1.weight loaded from bert.encoder.tag_blocks.0.norm1.weight of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.bias loaded from bert.encoder.tag_blocks.0.norm2.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.0.norm2.weight loaded from bert.encoder.tag_blocks.0.norm2.weight of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.bias loaded from bert.encoder.tag_blocks.1.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.proj.weight loaded from bert.encoder.tag_blocks.1.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.bias loaded from bert.encoder.tag_blocks.1.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.attn.qkv.weight loaded from bert.encoder.tag_blocks.1.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.bias loaded from bert.encoder.tag_blocks.1.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc1.weight loaded from bert.encoder.tag_blocks.1.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.bias loaded from bert.encoder.tag_blocks.1.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,036.036 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.mlp.fc2.weight loaded from bert.encoder.tag_blocks.1.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm1.bias loaded from bert.encoder.tag_blocks.1.norm1.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): 
module.bert.encoder.tag_blocks.1.norm1.weight loaded from bert.encoder.tag_blocks.1.norm1.weight of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.bias loaded from bert.encoder.tag_blocks.1.norm2.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.1.norm2.weight loaded from bert.encoder.tag_blocks.1.norm2.weight of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.bias loaded from bert.encoder.tag_blocks.2.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.proj.weight loaded from bert.encoder.tag_blocks.2.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.bias loaded from bert.encoder.tag_blocks.2.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.attn.qkv.weight loaded from bert.encoder.tag_blocks.2.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.bias loaded from bert.encoder.tag_blocks.2.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc1.weight loaded from bert.encoder.tag_blocks.2.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.bias loaded from bert.encoder.tag_blocks.2.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,037.037 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.mlp.fc2.weight loaded from bert.encoder.tag_blocks.2.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.bias loaded from bert.encoder.tag_blocks.2.norm1.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm1.weight loaded from bert.encoder.tag_blocks.2.norm1.weight of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.bias loaded from bert.encoder.tag_blocks.2.norm2.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.2.norm2.weight loaded from bert.encoder.tag_blocks.2.norm2.weight of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.bias loaded from bert.encoder.tag_blocks.3.attn.proj.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.proj.weight loaded from bert.encoder.tag_blocks.3.attn.proj.weight of shape (768, 768) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.bias loaded from bert.encoder.tag_blocks.3.attn.qkv.bias of shape (2304,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 
align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.attn.qkv.weight loaded from bert.encoder.tag_blocks.3.attn.qkv.weight of shape (2304, 768) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.bias loaded from bert.encoder.tag_blocks.3.mlp.fc1.bias of shape (3072,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc1.weight loaded from bert.encoder.tag_blocks.3.mlp.fc1.weight of shape (3072, 768) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.bias loaded from bert.encoder.tag_blocks.3.mlp.fc2.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.mlp.fc2.weight loaded from bert.encoder.tag_blocks.3.mlp.fc2.weight of shape (768, 3072) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.bias loaded from bert.encoder.tag_blocks.3.norm1.bias of shape (768,) 2022-03-17 14:04:57,038.038 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm1.weight loaded from bert.encoder.tag_blocks.3.norm1.weight of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.bias loaded from bert.encoder.tag_blocks.3.norm2.bias of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.encoder.tag_blocks.3.norm2.weight loaded from bert.encoder.tag_blocks.3.norm2.weight of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.bias loaded from bert.extra_embeddings.LayerNorm.bias of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.LayerNorm.weight loaded from bert.extra_embeddings.LayerNorm.weight of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.position_embeddings.weight loaded from bert.extra_embeddings.position_embeddings.weight of shape (512, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.token_type_embeddings.weight loaded from bert.extra_embeddings.token_type_embeddings.weight of shape (2, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.extra_embeddings.word_embeddings.weight loaded from bert.extra_embeddings.word_embeddings.weight of shape (30522, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.bias loaded from bert.pooler.dense.bias of shape (768,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.pooler.dense.weight loaded from bert.pooler.dense.weight of shape (768, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.bias loaded from bert.tag_logit.predictions.bias of shape (30522,) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 align_and_update_state_dicts(): module.bert.tag_logit.predictions.decoder.weight loaded from bert.tag_logit.predictions.decoder.weight of shape (30522, 768) 2022-03-17 14:04:57,039.039 2829:checkpoint.py:99 
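For reference, the entries above show a name-alignment step: the running model is wrapped (DataParallel/DDP style), so its keys carry a "module." prefix that the saved checkpoint's keys lack, and each parameter is matched by un-prefixed name and shape before loading. A minimal sketch of that behavior, assuming a plain PyTorch module; the function and variable names here are illustrative, not the project's actual checkpoint.py code:

    def align_by_prefix(model, ckpt_state, prefix="module."):
        # Match each model key against its un-prefixed counterpart in the
        # checkpoint, require identical shapes, and report each hit in the
        # same style as the log above.
        model_state = model.state_dict()
        aligned = {}
        for name, param in model_state.items():
            src = name[len(prefix):] if name.startswith(prefix) else name
            if src in ckpt_state and ckpt_state[src].shape == param.shape:
                aligned[name] = ckpt_state[src]
                print(f"{name} loaded from {src} of shape {tuple(param.shape)}")
        model.load_state_dict(aligned, strict=False)  # leave unmatched params as-is
        print(f"target model param = {len(model_state)}; "
              f"name matched = {len(aligned)}; loaded = {len(aligned)}")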
2022-03-17 14:04:57,044.044 2829:torch_common.py:1069 load_model_state_ignore_mismatch(): unique keys in init dict = ['module.cls.predictions.decoder.weight']; total = 1
2022-03-17 14:04:57,205.205 2829:torch_common.py:1074 load_model_state_ignore_mismatch(): unique key (not initialized) in current model = ['module.cls.predictions.decoder.weight']
2022-03-17 14:04:57,525.525 2829:qd_common.py:3452 print_frame_info(): func name = __init__; self = ; data = TaxCocoCaption; split = test; add_key = False; backend = cv; hold_buffer = 0; save_original = False
2022-03-17 14:04:57,553.553 2829:uni_pipeline.py:509 get_data_loader(): sampler =
2022-03-17 14:04:57,553.553 2829:uni_pipeline.py:1000 predict(): writing output/Logit_Vilt_captioning_testing_batch-size_512_encoder_vit_base_patch16_384_lr_1e-4_iter_60_vitbfocal20_bert_tokenizer_tags_ENC-DEC_multiplier_0.1_expand_tag-classifier_emb/snapshot/model_iter_0065000.pt.TaxCocoCaption.test.crop384.predict.tsv_0_32.tsv
2022-03-17 14:04:57,627.627 2829:uni_pipeline.py:907 predict_iter(): DatasetPlusTransform(dataset=, transform=Compose(
    Compose(
        ImageTransform2Dict(image_transform=Compose(
            ToPILImage()
            Resize(size=384, interpolation=PIL.Image.BICUBIC)
            CenterCrop(size=(384, 384))
            ToTensor()
            Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
        ))
    )
    LoadLabel(data=TaxCocoCaption, split=test, version=vinvl)
    TransCaptionTensorizer(tensorizer=, pad_to_max=True, pad_image_to_max=True)
))
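The Compose repr above fully specifies the test-time (crop384) image preprocessing, so it can be reproduced with stock torchvision. A sketch of that pipeline only, not of the project's ImageTransform2Dict wrapper: the decoded image is resized so its short side is 384 (bicubic), center-cropped to 384x384, and normalized with mean = std = 0.5, which maps pixel values from [0, 1] into [-1, 1]:

    from PIL import Image
    from torchvision import transforms

    image_transform = transforms.Compose([
        transforms.ToPILImage(),                               # array from the cv backend -> PIL
        transforms.Resize(384, interpolation=Image.BICUBIC),   # short side -> 384
        transforms.CenterCrop((384, 384)),
        transforms.ToTensor(),                                 # HWC uint8 -> CHW float in [0, 1]
        transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    ])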
uni_pipeline.py:908: 0%| | 0/10 [00:00
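The prediction file name ends in predict.tsv_0_32.tsv, which appears to encode a shard index and worker count (this run is presumably rank 0 of 32 distributed workers, each handling its 10 batches before the shards are merged). A hypothetical helper illustrating that naming scheme; the function name and the merge assumption are guesses, not the project's API:

    def shard_output_name(predict_file, rank, world_size):
        # e.g. shard_output_name("....predict.tsv", 0, 32)
        #      -> "....predict.tsv_0_32.tsv"
        return f"{predict_file}_{rank}_{world_size}.tsv"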