Managing contacts on a system that has 384KB of RAM is not as simple as it first sounds. This is compounded by the limited support for direct FAT file system access in the Hypervisor.
My current plan is to use a D81 disk image to store each contact in a single 512 byte sector. As the disk images are 8,192,000 bytes = 800KiB, this means we can have 1,600 contacts -- less a bunch for track 40 that we will keep reserved for a valid 1581 directory listing, that will basically identify it as a MEGAphone contacts disk image. That will leave 790KiB = room for 1,580 contacts.
To keep the contacts file a valid 1581 file (as well as supporting direct sector access), the first two bytes of each contact sector will be a standard link to the next sector. Actually, each 512 byte sector has to have two of these sector pointers, one at offset 0 and the other at offset 256, because the 1581 uses 256 byte logical sectors for commonality with the 1541/1571. That leaves us with 508 bytes per contact.
Now, what exactly we do with all 508 bytes is up for grabs. But what we can do, is decide on how to store the name and primary telephone number of the contact, since those are the two most important fields. Things like an avatar we'll leave for now as out of scope, as while desirable, they are not essential for basic operation.
For flexibility, we'll use a single byte to indicate the field type, followed by a length field. We'll use the LSB of the field type as MSB of the length field, to allow for fields to consume all 508 bytes.
0x00 - End of contact record
0x02 - Contact Name
0x04 - Contact Second Name (surname)
0x06 - Primary number
0x08 - Disk image name for SMS conversation file
0x0A - Number of unread SMS messages (length = 2, to allow for upto 65,535) unread messages.
The contact disk image should be sorted by telephone number for quickly finding a contact based on incoming telephone number (for handling SMS arrivals).
A secondary contact disk image should be maintained as a subordinate copy, re-sorted by name. And a 3rd copy sorted by second name. Whenever a contact is modified in the primary contact file, these subordinate contact indices should be updated. Similarly whenever a contact is added or deleted from the primary contact list.
This implies that we need a sort function for a contacts disk image, that sorts based on the supplied field type. As the MEGAphone won't have the 8MB HyperRAM in the initial build, we have to be able to do this sort in chip RAM. In practice, this means that we'll probably read 64KB of sectors in at a time, sort those, and then write them back, advance 32KB, and repeat. This will implement a kind of batched of bubble-sort. Multiple such passes will be required, until no further changes occur during a complete pass. We could do better in terms of big-O time cost, but this algorithm will be small and simple, and we can easily improve it down the track.
Now, we indicated having a disk image for each SMS conversation, We'll do a similar approach to the contact disk images, by storing the most recent 1,580 messages, each in their own 512 byte sector. The message format should indicate if it's an in-bound or out-bound message, the time and date of the message, followed by the message body. We'll use a similar type + length + data format:
0x00 - End of message record
0x02 - Message is in-bound (from contact) or outbound (1 byte length)
0x04 - Time and date of message (ASCII representation)
0x06 - Message body, UTF-8.
Now, for searching for contacts and messages within a conversation, we'll implement a horribly crude index, where we have a bitmap of 1,600 bits = 200 bytes for each paired combination of 56 characters, i.e., 56x56 = 3,136 256 byte sectors = 1,568 512 byte physical sectors, just fitting in the 1,580 sectors available in the D81. For the 56 characters, we will allow A-Z (case insensitive), 0-9, some punctuation, and German extra characters for now (ö, ä, ü and ß), given the initial market focuses will be English and German speaking. We'll also reserve one of those characters to mean "any other character," so that at least partial filtering will be possible, regardless of the language, provided it uses a reasonable fraction of latin letters. It's a far from perfect solution, but it will allow for fast searching, without using a lot of RAM or complex algorithms.
Basically when searching for a string like "potato", all we have to do is generate each consecutive character pairing, i.e., "po", "ot", "ta", "at", and "to", retrieve the bitmap for each and logically AND them together. Any 1 bits at the end mean that each of those two-character pairings are present in that message. This will likely be good enough, and result in a fairly low false-positive rate. It will also be tolerant to some spelling errors. We can slightly improve it by instead of ANDing, count the number of 1s in each message, and then return a search result list ordered by how good the match is. A bir crude, but I suspect it will be surprisingly effective.
We can use this kind of index on both the contact list, and one for each of the SMS message threads with each contact. That keeps code compact and simple.
When we receive or send an SMS, we need to look at the next available sector in the contact conversation (and possibly create a new contact if the number hasn't been seen before, I need to think about that), and put the message there, with the in-bound or out-bound marker set. Then if it's an inbound message, we need to update the unseen message count in the contact list, and then update that in the contact index disk images as well (ideally with a targeted update, rather than requiring a full re-sort that would take several seconds, probably.)
We should probably also have somewhere in the contacts disk image where we keep track of the whole unseen message count status, for quickly determining what we should show on the display to report unseen messages. This could be kept in some fake directory entries, which would also make it easy to query from 3rd-party programs.
If the thread for a contact fills, then we should either automatically rotate the conversation through, aging out old messages, create a 2nd disk image for the contact, prompt the user or otherwise. We should also keep track of what the most recent message position in the disk image is. This could be done in a fake directory entry.
Okay, that all sounds a bit complex, but the reality is that it's about as simple as we can make a functional SMS and contact handling system.
It will let us make code to ask for the most recent n messages, which will make it easier for us to update a scrolling display of messages, without having to keep them all in RAM at the same time. That will need to work backwards, since by default we want to show the most recent message at the bottom, and be able to scroll backwards in time, without having to get all crazy-pants with algorithms. It might be that we allow scrolling only message-by-message to keep this simple. Probably okay to begin with.
This means our SMS message chain renderer also needs to start from the bottom of the screen and work its way backwards up the display. Again, not too hard to do. It's just a matter of implementing it.
Now, for managing piles of contacts with out disk searching going too slow, I'll probably use a directory structure, where we have PHONE/THREADS/00-FF/00-FF/ allowing for 64K contacts, each in their own directory, with one or more D81s holding conversations, and whatever else we decide to put there (like avatar images). Then we'd also have PHONE/CONTACT0.D81, CONTACTS/IDX02-0.D81 (for sorted by first name index) and PHONE/IDX04-0.D81 (for sorted by second name index).
Now, one unsolved issue is messages received from numbers that are not yet contacts. We could setup a new contact for each automatically, but as we have index issues beyond 1,580 contacts (although we could have CONTACT0, CONTACT1 etc D81s, to allow for growing beyond 1,580 contacts), it probably makes sense to just put all messages/calls from unknown senders into a common thread for now, with the sending number indicated in a specific data field.
That also reminds me, we will be treating missed calls as a special kind of SMS message, likewise whenever you call someone, it will be logged as a virtual outbound SMS.
Corrupt indices we will deal with by just rebuilding them.
Corrupt contacts we will just live with. On SD card they shouldn't happen, but if they do, it should only mess up fields in the record, and we can just let the user edit it, and make it correct when written back.
Deleting contacts or messages just leaves an empty record in the disk image, which the sort function will clean up.
Now, we also need to store the current state of the telephony interface for when we get interrupted by an incoming call, incoming SMS or other event that requires us to switch which part of the telephony software is running (we are really breaking the software down into a bunch of smaller programs, so that each can fit in 64KB using the CC65 compiler (which really means about 40KB of code, which with CC65 is not necessarily that much). On returning from a helper program, we should be able to restore the state of what we were doing, e.g., showing any active call status, ringer status etc, restoring a message from draft status (we should store those in a reserved sector of the contact thread if it's a message to a contact, but that won't work if the recipient isn't a contact). There are probably other things, too.
To manage this, except for draft messages to contacts which we can put in a reserved sector in the thread D81 for each contact, we should probably just have a STATE.D81 where we progressively allocate sectors for each bit of state. Those sectors should be chained into the directory listing with appropriately named files, so that they can be easily modified by 3rd party programs -- so long as their track and sectors don't change on the disk, as we want to bypass having to use the CBDOS file system code (partly so that we can reclaim the 128KB of chip RAM where the "ROM" usually sits). We'll mark the files as locked, so that they are nominally read-only, and include a warning in the directory entry that files must retain their track and sector numbers, and should only be read by 3rd party software if using the file system, instead of direct sector access.
When messages are received from the cellular modem (e.g., incoming call, incoming SMS, call terminated etc), then these should be written verbatim into a message buffer in STATE.D81 (to keep common cellular modem monitoring code simple), state should be stashed and then processed by the appropriate helper program. Now, there is an issue here where a denial of service attack could be mounted against a user, by having people continually ring and/or text them, so that the thing is endlessly occupied saving and restoring state and switching programs. To mitigate this, we should probably process RING messages from within each program, and, ideally, messages from the same contact that is currently being displayed, too. But everything else can be batched up and processed periodically.
Or, we could have the program that does all this processing persistently held in RAM (or stored in the flash filesystem, which is also super-fast!), and then basically bank switched (or DMA swapped, or equivalent) into place, so that the latency and duration of activity is greatly reduced. There are limits to this, though, because whenever we have to switch mounted disk images, there will be a considerable delay, as the hypervisor has to do FAT file system searches. So we really want to avoid doing that too often. It might be that we limit ourselves to doing a process of SMS messages addressed to other contacts/numbers to once every 10 seconds or so, or whenever the unprocessed message queue reaches 50% full, to give us time to drain it, without losing messages.
We'll reserve a healthy slab of STATE.D81 for unprocessed messages, perhaps 128KB, which would be enough for 256 unprocessed messages.
And I think that covers the whole back-of-house stuff that we need for tracking SMS and phone calls.
So let's try to turn this into a set of requirements (which expand on the requirements we already have documented in the repository, and are subordinate to them):
Requirement: Linux program that can create the various core D81 files (e.g., STATE.D81).
Requirement: Linux program that sets up the directory structure on the SD card for the MEGAphone (PHONE/*).
Requirement: Code in dialer to save/restore state in STATE.D81
Requirement: Code in each UI program to save/restore state in STATE.D81 and any relevant contact thread.
Requirement: Code in each UI program to trigger a switch to the message queue processor program after a given timeout and/or the unprocessed queue has reached the 50% full level.
Requirement: Function to create a contact (including creating the D81 file(s) required).
Requirement: Function to modify a contact.
Requirement: Copy contact disk to index disk.
Requirement: Sort index disk by specified key.
Requirement: Create or open thread disk image.
Requirement: Insert message into thread disk image.
Requirement: Retrieve message from thread.
Requirement: Update unread message count for contact.
Requirement: Update unread message status for phone.
Requirement: Log non-contact calls/SMS as pseudo-contact "unknown"
Requirement: Display an SMS / call log message (either in-bound or outbound, using appropriate formatting to differentiate)
Requirement: Display a screen full of SMS / call log messages.
Requirement: Text entry for composing an SMS.
Requirement: Dialer entry for telephone number.
Requirement: Contact list. Select to open message list.
Requirement: Exit message list back to contact list.
Desirable: Allow inserting unicode/emoji in contact fields.
Desirable: Delete old messages from thread.
Desirable: Unicode/Emoji entry for SMS composition.
Desirable: Contact list search function by typing a fragment of a name or number.
Desirable: Message thread search function by typing a fragment of a message.
Desirable: Highlight sections of messages that match search query.
Desirable: Update search index based on a single contact or message.
Desirable: Rebuild search index for all contacts/messages.
Desirable: Associate avatar/emoji with contacts.
Desirable: Display contact avatar/emoji on incoming call.
Desirable: Dispaly contact avatar/emoji during call.
Desirable: Allow SMS composition during calls.
Desirable: Allow pasting location into SMS message (including during call).
Desirable: Allow showing last known direction and distance to contact based on last shared location (including during a call).
That's all probably enough now, that I can start digging in to implement it all.
The above requirements specification also addresses the following milestones in the NLnet funded project:
Milestone 1.4 Telephony software User-Interface: Requirements Specification
Milestone 1.6 Text Messaging software: User-Interface: Requirements Specification
No comments:
Post a Comment