alerton
Petz Petter
Posts: 8
Petz Versions: 5
|
Post by alerton on Mar 14, 2024 12:47:27 GMT -5
Hi guys! This isn't specifically a hexing question, but this felt like the most appropriate place on the forum to post. I've been looking at the PET files just out of curiosity, and I noticed that two cats I adopted today show up in two completely different file languages when I open them in the Notepad app on my Windows 11 machine. I adopted an Egyptian Mau named Stinkied and a Russian Blue named Mackerel. imgur.com/B0BdNNY imgur.com/7tLOGNU Mackerel's PET file is in ANSI with double the number of characters, and Stinkied's file is in UTF-16 LE. Does anyone know what this means and why they are different? (edit: Stinkied is a Desert Lynx, not an Egyptian Mau, sorry about that!) (another edit: these are from the Petz 5 game, if that makes a difference)
|
|
|
Post by Kieran @ Fractal on Mar 14, 2024 14:38:33 GMT -5
The only thing I can think of is that Maus were made by Ubisoft after they acquired the brand. Interesting observation!
|
|
|
Post by alerton on Mar 15, 2024 12:45:22 GMT -5
Interesting! I looked into this (https://petzcommunity.fandom.com/wiki/Original_Breeds) and found out that Ubisoft added 5 new cat breeds and 5 new dog breeds once they took over, and indeed Maus are one of the new "ubi breeds". So to look into this further I adopted an Alley Cat named Shrimpy, which is an original breed, and an Egyptian Mau named Eggy. (I just now realized that in my original post I said Stinkied was a Mau, but she's a Desert Lynx, which still makes your theory hold, as the Lynx is one of the new "ubi breeds"! Sorry about that.) imgur.com/a/eAtJtWo imgur.com/Iha9YJt As you can see, the PET file for Shrimpy is in UTF-16 LE and Eggy's is in ANSI. So they're the opposite of the PET files I posted screenshots of in the original post. I'm even more confused now!
Going by your Ubisoft observation, Kieran, I thought that Eggy would be in UTF-16 LE and Shrimpy would be in ANSI. I also adopted another Desert Lynx just to double-check, and it's still in UTF-16 LE.
I think I'm going to have to adopt a pet of every single cat (and dog) breed and see if there is any rhyme or reason behind the different coding languages used in their PET files.
I don't know if this is interesting to anyone else haha, I'm just really curious about the difference in the PET files and why they're like this. Oh, and this is all for Petz 5 btw.
|
|
|
Post by Kieran @ Fractal on Mar 15, 2024 13:39:41 GMT -5
Nah I think it's pretty interesting! I certainly would never have noticed it. Keep us all posted, cuz I'm out of ideas haha
|
|
|
Post by alerton on Mar 16, 2024 5:03:04 GMT -5
Ok, so I adopted a pet of every single cat and dog breed. I wrote down every breed in my notebook for an easy overview, along with F for female and M for male in case it made a difference (it didn't seem to: when I changed my female Mau to a male Mau, the code language didn't change). I also put stars next to the breeds that were made by Ubisoft, the so-called "ubi breeds". imgur.com/zhylknC
Looking at this, there are differences from my other posts. In my original post my Lynx's PET file was in UTF-16 LE, but when I looked today it was in ANSI! The only thing I've changed is my Lynx's name, which I simply changed to Lynx to make looking at the files easier, and in the cat's profile I wrote down her original name so that I wouldn't forget it when I change her name back. So I'm a bit confused about why that change has occurred. It's the same with my Egyptian Mau: it's now in UTF-16 LE instead of ANSI like in my previous post, and like my Lynx, all I did was the name change for ease. I'm losing my mind looking at these files, like why have they changed!!
The patterns I'm noticing today, though: among the "ubi breeds", 3 cat breeds are in ANSI and 2 are in UTF-16 LE, and it's the same for the dog breeds, 3 ANSI and 2 UTF-16 LE. The original cat and dog breeds are equally split between ANSI and UTF-16 LE: 5 ANSI and 5 UTF-16 LE for the original cats, and 5 ANSI and 5 UTF-16 LE for the original dogs. I don't know what any of this means hahaha
|
|
|
Post by Reflet on Mar 16, 2024 15:50:06 GMT -5
Hi, .pet file specialist here. It looks like you're opening the files in Notepad, which is meant for reading text files, but .pet files are binary files. You can read about the difference here! tl;dr: "Binary files have no inherent constraints (can be any sequence of bytes), and must be opened in an appropriate program that knows the specific file format (such as Media Player, Photoshop, Office, etc.). Text files must represent reasonable text, and can be edited in any text editor program."
In order to support different character sets around the world, we humans have come up with different encoding schemes to represent them. So, when you open a binary file in Notepad, Notepad is like "Okay, surely this is text data, but it looks really weird. I thiiiink this is the right encoding to use...? Hopefully that's good enough."
Example of different encodings: UTF-8 vs. UTF-16. Notice how the UTF-16 file uses two bytes per character, while the UTF-8 file only uses one byte per character (true for basic Latin characters; other characters can take more bytes). The distinction between LE (little-endian) and BE (big-endian) on UTF-16 is the order in which the two bytes of each two-byte sequence are stored.
If you're curious about the .pet file structure, I have a breakdown here (and I'm working on an updated version with more info, better formatting, fixed errors, etc).
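To make the byte counts concrete, here is a small Python sketch. It is only an illustration: the sample string is made up and has nothing to do with real .pet data. It encodes the same text as UTF-8, UTF-16 LE and UTF-16 BE and prints the raw bytes in hex.

```python
# Illustration only: "Petz" is an arbitrary ASCII sample string,
# not data taken from a real .pet file.
text = "Petz"

utf8 = text.encode("utf-8")
utf16_le = text.encode("utf-16-le")  # low byte first, no byte-order mark
utf16_be = text.encode("utf-16-be")  # high byte first, no byte-order mark


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


print("UTF-8    :", hex_bytes(utf8))      # 50 65 74 7a
print("UTF-16 LE:", hex_bytes(utf16_le))  # 50 00 65 00 74 00 7a 00
print("UTF-16 BE:", hex_bytes(utf16_be))  # 00 50 00 65 00 74 00 7a
```

The zero bytes in the UTF-16 output are the "extra" byte of each character, and LE vs. BE only changes which side of the pair they land on.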
|
|
|
Post by alerton on Mar 18, 2024 7:34:52 GMT -5
Hey, thanks for the in-depth answer and the links! Will definitely check them out. Do you know why, though, when I open the PET files in Notepad, they are shown as ANSI and UTF-16 LE specifically? I'm just curious as to why they show up as those two, and why there seems to be a pattern across the cat and dog breeds in how many files show up in each of the two coding languages!
|
|
|
Post by Reflet on Mar 19, 2024 14:53:30 GMT -5
The Win32 function Notepad uses to figure out the encoding of a given file is IsTextUnicode, which, according to the documentation, "uses various statistical and deterministic methods to make its determination".
I looked with my disassembler, and Notepad appears to analyze the first 1024 bytes of the .pet file, which contain all of the LoadInfo as well as the beginning of the PetzInfo, up to the "first byte of the Action ID for the Chicken flavor's Down gesture's first trick slot" (see my breakdown for what fields are contained, if you're curious).
ANSI is a single-byte encoding with 256 possible characters, meaning every byte can technically be interpreted as ANSI (whether or not it should be is a different question), so that's probably the default. However, due to slight variations in the data for an individual pet (their name, breed name, breedfile path, session ID, instance ID, seed, species number, load flags, timestamps, neglect/runaway thresholds, checksum, two unused bytes that may or may not contain random garbage data from RAM, and a portion of the trick data), those bytes may pass the statistical test for UTF-16. That would likely also explain why the result can flip after a rename, since the name sits inside those first 1024 bytes. After all, only a fraction of those 1024 bytes are supposed to be interpreted as text data, and most of those bytes are null (0) because the fixed-length name/breed/breedfile text buffers are, under normal circumstances, largely empty (256 characters for name/breed, 260 for breedfile, but it doesn't take anywhere close to that much to store your average name/breed/breedfile data).
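For a feel of how a statistical guess can tip either way on mostly-null binary data, here is a toy Python sketch. To be clear, this is NOT the real IsTextUnicode algorithm, which isn't spelled out in this thread. The `guess_encoding` function, its 50% threshold, and the `fake_header` layout are all made up for illustration and do not match the actual .pet format.

```python
# Toy heuristic, NOT the real IsTextUnicode logic. The function names,
# the threshold, and the buffer layout are hypothetical.

SAMPLE_SIZE = 1024  # number of leading bytes inspected, as described above


def guess_encoding(data: bytes, threshold: float = 0.5) -> str:
    """Guess 'UTF-16 LE' if enough 2-byte units look like ASCII stored
    little-endian (non-zero low byte, zero high byte); otherwise 'ANSI'."""
    sample = data[:SAMPLE_SIZE]
    # Split into 2-byte units and ignore units that are pure null padding.
    units = [sample[i:i + 2] for i in range(0, len(sample) - 1, 2)]
    units = [u for u in units if u != b"\x00\x00"]
    if not units:
        return "ANSI"  # nothing to judge by, fall back to the default
    utf16_like = sum(1 for u in units if u[0] != 0 and u[1] == 0)
    return "UTF-16 LE" if utf16_like / len(units) >= threshold else "ANSI"


def fake_header(name: bytes, numbers: list[int]) -> bytes:
    """Build a made-up 1024-byte buffer: a null-padded 256-byte name field,
    then little-endian 32-bit integers, then null padding."""
    body = name.ljust(256, b"\x00")
    body += b"".join(n.to_bytes(4, "little") for n in numbers)
    return body.ljust(SAMPLE_SIZE, b"\x00")


# Same name, different numeric fields: small values leave zero high bytes
# behind, which this toy test mistakes for UTF-16 LE characters.
small = fake_header(b"Eggy", [5, 12, 200, 3, 77, 1])
large = fake_header(b"Eggy", [0x12345678, 0x0BADF00D, 0x7FFFFFFF])

print(guess_encoding(small))  # UTF-16 LE
print(guess_encoding(large))  # ANSI
```

The two buffers differ only in a handful of numeric bytes, yet the guess changes, which is the same kind of sensitivity described above for per-pet data.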
|
|
|
Post by alerton on Apr 23, 2024 12:22:47 GMT -5
Thank you so much for your replies, Reflet! I appreciate it a lot :-)
|
|