Sunday 28 June 2020

Ultrasonic communications for the MEGAphone: Testing speaker and microphone performance

This might sound like a rather strange thing to be working on, but there is a reason.  The NLnet Foundation has taken an interest in the MEGAphone as a secure and "sovereign" communications device, that can play a role in civil society.  This largely aligns with what I have previously said about the need for such self-sovereign communications systems in the face of the coming digital winter.  

In short, NLnet have kindly agreed to fund a body of work on advancing the MEGAphone, and by implication, advancing the MEGA65 project. This means that I will be spending a considerable amount of time in several pieces between now and early next year working on the MEGAphone, and therefore more time working on the MEGA65 as a whole than I otherwise would have, so it's a positive all round.  But I understand that some folks are only interested in the MEGA65 as a retro-computing platform.  In which case, feel free to pay less attention to these posts. That said, they will still be very much focused on fun retro development, and on solving real-world problems with such a system. After all, it's not every day that someone makes an 8-bit smart-phone that can communicate via ultra-sound!

If borders open, and international travel becomes feasible again, this work will be done in Darmstadt with the rest of the MEGA65 team, but until then, I'll continue to work from here, since there isn't really any alternative.

Speaking of the effects of COVID-19, this has also slightly tweaked the work plan, as NLnet are keen for us to look at how the idea of a sovereign and fully open phone can be used to produce a more privacy-protecting and less error-prone form of contact tracing. In particular, they are interested to know how practical implementing near ultra-sound communications would be, to see if it makes sense as an alternative to bluetooth-based proximity detection. Ultrasound has some nice theoretical advantages, like not working through walls or other barriers that are likely to also be effective barriers against virus transmission. Thus it has the potential to reduce the false-positive rate.

Exploring ultra-sonic communications is something that I have been wanting to do for a while, and was already in the plans for the MEGAphone as a means of resilient communications.  For those who remember when I first talked about the MEGAphone, they might recall that it already has a bunch of weird communications modes available, including an IR LED that can probably turn TVs off from 200m away.   It also has not just one microphone, but an array of four.

The microphone array was intended to make it easier to cancel background noise, and to detect the direction a sound is coming from, so that speaker-phone mode could do various party tricks with it. However, having read about ultrasonic communications around that time, I chose microphones that are sensitive well into the ultrasound range.  So, it's quite possible that we may be able to achieve ultrasonic communications with the MEGAphone.  And that is the first milestone in the NLnet project:

1. Assess near ultrasound capability of existing MEGAphone prototypes.
There already exists several revision 1 hardware prototypes of the MEGAphone, that form the basis for forward activity on the project. These include MEMS microphones and amplified speaker output functionalities that have the potential for near ultrasound communications. The purpose of this sub-task is to assess the feasibility of this, and gain an understanding of what may be possible. It is entirely possible that this will prove infeasible with the current hardware, in which case the reasons and potential remedies for this will be considered for inclusion in a future revision of the MEGAphone hardware.
Milestone(s)
  • Examine the existing components of the MEGAphone r1 and r1b prototype hardware, in particular the MEMS microphones and speakers, to determine their theoretical suitability for near ultra-sound communications.
  • Determine the ultrasonic frequencies and bandwidth that are likely to be possible to use, and consider the constraints that this is likely to place on any protocol designed to use this facility.
So my first goal here is to look at the speakers and microphones in the MEGAphone, and see what their theoretical properties are, whether they have the potential to be used for ultrasonic communications, and if so, what frequency bands we can expect to be able to use. Of course, we could also consider dedicated ultra-sound transducers, but the goal here is to use the existing speakers and microphones, rather than increasing the bill of materials of a phone.  So let's get started:

Microphones

The MEGAphone R1 prototypes have four MEMS microphones. As physically very small structures, they have a natural resonance that is well into the ultrasonic frequencies. The ones that we are using are the SPW0690LM4H.  According to Table 2 of that datasheet, they have a resonant frequency peak at 26kHz.  Table 2 also tells us that at 15kHz they are 3dB more sensitive than at 1kHz.  Thus we can expect that they will likely be competent up to around 40kHz or so. Figure 9 provides some more information, showing the frequency response:



Here we see a peak at 26kHz as promised. We also see that sensitivity is quite reasonable all the way to the 80kHz limit they show, and in fact from ~72kHz the sensitivity is again above that of the acoustic range.  So while somewhere around 26kHz would be ideal, with a benefit of close to +20dB versus the acoustic band, any ultrasonic frequency up to at least 80kHz should be usable.  So the receive side is not likely to be a problem.

Speaker

The speaker in the MEGAphone is a CMS-40504N-L152.  As this is not on the PCB, we can in principle easily swap it out for another.  However, hopefully we won't need to do that, because these speakers are simply fantastic for the MEGAphone. They are super-loud and take up to 2W for loud ringing and playing games, and have good frequency response thanks to their relatively large 40mm diameter. And for all that, they are only ~5mm thick.

Here we have a less positive prospect: It claims a maximum frequency of 7kHz.  This will presumably be the maximum frequency with reasonably flat response, above which it will presumably roll off.  Digging through the datasheet we find:

Well, this is much better than feared.  Yes, above 7kHz, it is 10dB below the lower frequencies. But it isn't a flat roll-off. Rather there are a couple of interesting peaks, the first at ~9kHz, and then a second at what looks to be ~18kHz, after which there is a similarly big drop-off.  Unfortunately, they don't show any higher frequencies on the chart. This is a bit unfortunate, as 18kHz is still audible to some younger people (I can still just hear 17.4kHz, and maybe a bit higher, and I am in my 40s).

But I am just about willing to bet that there is another peak at ~3x9kHz = ~27kHz, which would nicely coincide with our microphone's peak sensitivity. If possible, that would be a particularly effective combination. The question is whether the speaker is still loud enough, and whether we would start to get audible distortion.  This will require some experimentation.

But to summarise: 18kHz should be fine, and if we can confirm it, ~27kHz has potential.  27kHz also has the advantage that it is safer at high volume levels than frequencies below 20kHz. There is still the residual problem that some animals are quite sensitive even around 30kHz, cats in particular (although it looks like tuna fish won't be bothered in the least).

Either way, the frequency response around these frequencies is a bit sharp, so the bandwidth available is likely to be quite narrow -- perhaps less than 1kHz.  This will affect what wave-forms we might try to use for a communications protocol.  For example, frequency shift keying might be problematic, once we allow for variation of the peak resonant frequency among speakers, a bit of Doppler shift, and so on. Similarly, chirped spread spectrum would be problematic, due to the very narrow frequency band available for the chirps.
 
Although for contact tracing, we can probably assume that if you are moving past someone fast enough to cause a large Doppler shift, you are probably unlikely to catch anything from them -- unless perhaps you are on a fast merry-go-round near a stationary person for a long time.  Thus we can probably ignore the problem of large Doppler shifts due to high velocities.

Would other speakers be any better?

There are certainly other speakers that have higher rated frequencies, and thus might be capable of higher frequencies.  We'll examine that a bit later, if it turns out to be necessary. This is because we need only a few metres range, and if we can achieve this with the existing speaker that is optimised for our other needs, then there is no point changing it.  To determine that, we need to consider our total link budget, similar to if we were using radio.

Link-Budget Estimate

Let's now think about what the link budget at 18kHz and 27kHz would be. We will start by examining the microphone sensitivity and maximum speaker volume, and then subtract the free-space path loss as the ultrasound disperses through the air.

According to the speaker data-sheet, we should be able to produce a signal of ~94dB SPL (sound pressure level).

The free-space path loss through air is relatively minor for the frequencies and distances that we are concerned about. According to this calculator, the absorption loss will be <1dB per metre. However, the total path loss due to the inverse square law etc., is more like 20dB over 1m, ~34dB over 5m and ~41dB at 10m.
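
As a sanity check on those figures, here is a small Python sketch of the loss model. It is my own reconstruction, not taken from the calculator: I assume spherical spreading from a 0.1m reference distance, plus ~0.1dB/m of atmospheric absorption, which reproduces the quoted ~20/34/41dB figures:

```python
from math import log10

REF_DISTANCE_M = 0.1       # assumed reference distance for the 0dB point
ABSORPTION_DB_PER_M = 0.1  # assumed, consistent with "<1dB per metre"

def path_loss_db(distance_m: float) -> float:
    spreading = 20 * log10(distance_m / REF_DISTANCE_M)  # inverse square law
    absorption = ABSORPTION_DB_PER_M * distance_m
    return spreading + absorption

for d in (1, 5, 10):
    print(f"{d:>2} m: {path_loss_db(d):.1f} dB")
```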

Finally, we have the sensitivity of the MEMS microphone to consider. Here I am having to stretch my understanding of the data-sheet -- so if anyone reading this spots any errors in my reasoning, please let me know, so that I can fix it. 

The MEMS microphones indicate a signal-to-noise ratio of 82dB for near-ultrasound frequencies (around 20kHz) with a 94dB SPL (sound pressure level) sound source. We know that this should improve as we approach the resonance at 26kHz, so we will assume that we can detect down to 94 - 82 = 12dB.

Pulling this together, at a range of 10 metres, we should have a link budget of 94dB SPL - 41dB - 12dB = 41dB.  If we allow 20 or 30 dB for multi-path interference and all the usual horrors of propagation, we should have a signal with an amplitude of 10dB to 20dB, i.e., 10x to 100x the background noise level.  In reality, this might be somewhat worse due to interference from ambient noise sources, and muffling effects, such as having the device in your pocket or bag.  That said, if we are aiming for 1.5m instead of 10m, there is better than 10dB of margin to be gained.  Also, all of this assumes an omni-directional sound source, which will not be true -- but it is a starting point.
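
For the record, the same arithmetic as a trivial code sketch, with the 30dB fade allowance picked from the top of the range mentioned above:

```python
# Link budget at 10m, using the figures quoted above.
SOURCE_SPL_DB = 94       # maximum speaker output, per its datasheet
PATH_LOSS_10M_DB = 41    # spreading + absorption at 10 metres
MIC_FLOOR_DB_SPL = 12    # 94dB SPL - 82dB SNR, from the microphone datasheet

link_budget_db = SOURCE_SPL_DB - PATH_LOSS_10M_DB - MIC_FLOOR_DB_SPL
fade_margin_db = 30      # allowance for multi-path and propagation horrors
snr_db = link_budget_db - fade_margin_db
print(link_budget_db, snr_db)  # 41dB budget, ~11dB worst-case SNR
```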

However, overall, it seems like we should have a positive link budget at the end, with an SNR of somewhere around 10dB to 20dB.  So for now, we don't need to immediately try a different speaker. 

Constraints on Protocol Design

So it sounds like we should have a signal of a few hundred Hz bandwidth, and with a link budget of 10 -- 20 dB.  Within this constraint, there is still considerable freedom for selecting a solution.  Optimising that would require considerable effort and expertise beyond what we possess, or what is in this case actually required: contact tracing requires only very modest data transfer rates.  Each beacon can potentially be only a dozen or so bytes long, and need only be sent every minute or so.  Given most guidelines are for 15 minutes of close proximity, this would allow significant redundancy in the beaconing, to help reduce the probability of false negatives.

If we allow for 15% channel efficiency for a simple ALOHA approach, and up to 100 people within range of each other at a time, this means that each beacon must consume less than 0.15% of each time step. For a 1 minute beaconing interval, this corresponds to an air-time of 90 milliseconds.  Assuming we need a 32-bit synchronisation preamble and a 64-bit random token, this would require a data rate of ~96 bits / 90 milliseconds = ~1kbit / second.  This seems totally achievable within the expected channel characteristics.
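
That air-time and data-rate arithmetic is easy to get wrong, so here it is as a checkable sketch; the 15% efficiency and 100-user figures are the assumptions from above:

```python
# Beacon air-time budget for a simple ALOHA channel.
CHANNEL_EFFICIENCY = 0.15   # assumed usable fraction of the channel
MAX_USERS_IN_RANGE = 100
BEACON_INTERVAL_S = 60

# Each beacon may use at most 0.15% of the beaconing interval.
airtime_s = BEACON_INTERVAL_S * CHANNEL_EFFICIENCY / MAX_USERS_IN_RANGE
beacon_bits = 32 + 64       # synchronisation preamble + random token
data_rate_bps = beacon_bits / airtime_s
print(airtime_s, data_rate_bps)  # ≈ 0.09 s, and ~1067 bits/second
```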

If the token consisted of a 48-bit unique token and 16-bit CRC, this would provide for robustness in the protocol, while still maintaining a very low false-positive rate.  We can use the Birthday Paradox to compute the beacon collision rate with 2^48 tokens and not more than 2^32 simultaneous users of the system:  The probability of any two users using the same beacon at the same time would be ~1/(2^(16/2)) = ~1/256. Thus we would expect of the order of 10^1 colliding beacons per day, globally.  Assuming that less than 1% of users of the system test positive for the virus on any given day, this would result in ~10^(1-2) = ~1/10 false positive situations per day, i.e., of the order of one person per week being erroneously told that they had been in contact with someone with the virus.  This is almost certainly below the noise floor of the viral testing procedures and various other factors. Thus we accept this false positive rate.

As can be read below, a channel bandwidth of around 1kHz seems quite possible -- provided that the other technical barriers can be resolved.  Thus while careful protocol design and implementation would be necessary, the channel bandwidth would not impose any particularly troublesome constraints on the protocol design.

In fact, the very short range of communications that would be likely realised would result in rather more relaxed constraints than described above.  This could allow for a reduction in data rate and/or longer packets containing more information, which could be used to effectively eliminate the remaining false-positive rate.

Conclusion and Recommendation

For the full detail on how the following conclusions were reached, read the "Appendix: Experimental Verification" section below, but the TL;DR version:

1. It is possible to perform bi-directional near-ultrasonic communications using the existing hardware of the MEGAphone prototypes.
2. There is likely sufficient bandwidth to implement a realistic contact tracing facility.
HOWEVER
3. My suspicions about the real-world practicality of this led me to go beyond the scope of the milestone and actually test the components, which identified several important factors, including:
4. The audible artefacts, limited communications range, and by implication, the power budget of near-ultrasonic communications in this context are rather problematic. 
5. It would seem that for all of its problems, Bluetooth is a better solution, due to its vastly superior performance in terms of range, lack of audible artefacts and existing wide-spread deployment.
THEREFORE
6. I do not recommend further pursuit of creating a near-ultrasonic contact tracing system using mobile phone hardware at this point in time.  This conclusion does not impact on the creation of bespoke devices that avoid these problems through careful design.
IN ADDITION
7. While not well suited to a contact tracing use-case, we have established near ultra-sound as a quite feasible means of digital communications using off-the-shelf mobile phone parts. This may be of assistance to people operating in areas where electromagnetic communications are denied or heavily surveilled.  

Appendix: Experimental Verification

It's all well and good to say that the above should work. But as we know, saying "should" in computer science usually means "won't, despite all expectation". Or as my German friends like to remind me, "Eigentlich ist die stärkste Verneinung", i.e., "actually" or "in reality" is the strongest contradiction.  Thus we want to make sure that we are not barking up the wrong tree, and want to at least observe and characterise the MEGAphone's actual near ultra-sound performance.

My approach here is to produce an ultra-sound tone using the speaker, and then pick that tone up using the microphone array.  This will be done on the MEGAphone using some software I have written for this purpose.

The first step is to make sure that the MEGAphone microphones really are sensitive to ultra-sound.  For this, I used a tone generator app on my boring Android phone to make sure that I can pick it up.  The MEGAphone R1 prototype has one dead MEMS microphone, so I had to fiddle to pick one that was working, and that was conveniently located where I could get to it. This let me test that I could receive a 20kHz tone. I didn't measure the signal to noise ratio (SNR), because I don't know the actual SPL loudness of the phone's speaker at that frequency, or what funny filtering the Android phone has.  I did discover that my daughter can hear up to 19kHz quite fine, when she started to complain while I was testing ;)

So now the next step is to modify the MEGA65 test programme I wrote so that it can also produce the tone.  This will be fairly easy, as the MEGA65's audio output is continuously integrated using a pulse-density modulator (PDM), so sample rates up in the ultrasound range shouldn't be a problem.

Also, running the CPU at 40MHz gives us plenty of horse power to do this -- or at least I hope so. I will run a non-maskable interrupt at some multiple of the target frequency, driving a small sine curve table of values at maximum volume.
The interrupt loop will need to be quite tight, but it should be okay. Something like:

sampleplayernmi:
   PHA                  ; save the registers we use
   PHX
   LDX $FD              ; current index into the 16-entry sine table
   LDA ($E0),X          ; fetch the next sample value
   STA $D6F9            ; write it to the digital audio output (left channel MSB)
   INX                  ; advance the index, wrapping at 16 entries
   TXA
   AND #$0F
   STA $FD
   PLX                  ; restore registers
   PLA
   RTI

That will take something like ~35 cycles, including the interrupt entry and exit, plus a few cycles jitter while the CPU finishes whatever instruction it was executing when the interrupt is triggered. So close to 40 cycles = 1usec. Thus our over-sampled frequency can be up to 1MHz.  If we have 16 entries in our sine table, then this allows up to 1MHz / 16 = 62.5kHz as our maximum effective frequency.  That should be sufficient, since we only need about half that.  Thus about half the CPU time will be spent on the interrupt, leaving the other half to read the microphone samples and visualise them.
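
To double-check that cycle budget, assuming the 40.5MHz CPU clock and a ~40 cycle interrupt:

```python
# Rough throughput of the NMI-driven sample player sketched above.
CPU_HZ = 40_500_000
CYCLES_PER_NMI = 40        # ~35 cycles of work plus a little jitter
SINE_TABLE_ENTRIES = 16

oversample_hz = CPU_HZ / CYCLES_PER_NMI           # ≈ 1MHz sample rate
max_tone_hz = oversample_hz / SINE_TABLE_ENTRIES  # ≈ 63kHz maximum tone
print(oversample_hz, max_tone_hz)
```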

Ideally we would synchronise the sample reading with the writing, as this would give us a display that is always synchronised with the tone we are transmitting.  So I might rework the above to read and stash the microphone samples, as well. That will eat more CPU time, though, so I'll probably have to optimise it carefully.  It might be, that instead of using an NMI to drive this, that I just have a tight loop that does both, and interleaves this with doing the visualisation.  The goal here is not something that is perfect, but rather, something that demonstrates that we can produce and receive a signal.

As I was thinking about it overnight, it occurred to me that I might just be able to use the SID chip implementations in the MEGA65 to produce a tone at the target frequency.  This would avoid all the CPU timing complications.  The catch is that the SID doesn't do pure sine waves, but only triangle, saw-tooth and square wave waveforms.

The problem here is that this means that it will introduce harmonics.  I don't know if the harmonics will only be at higher frequencies -- and thus not audible -- or whether there will also be audible harmonics at the lower end.  I'm not a signal processing expert, but my intuition and little bit of radio experience tells me that we can expect at least heterodyning from reflections. That is, the signal will mix with its reflections, and this will produce the sum and difference of each frequency present in the signal.  If any of those harmonics are loud enough, it could produce an audible artefact. The good news is that this is, by definition, testable.  So I can live with that.

So, coming back to producing tones with the SID chip, I need to find out what the maximum frequency is that the SID can produce. Fortunately I have done a bit of mucking about inside the VHDL SID implementation we are using, so I know, for example, that it internally uses a clock in the tens of MHz to generate everything, so tens of kHz shouldn't be a problem. Also, the SID chip's frequency generation formulae are well known, so we should just be able to work out the correct register settings, and again, just try to produce something.


Let's start with working out the register settings... and here we hit a snag: Although the internals of the SID chip are capable of much higher frequencies, the registers for the frequency generators can't go above about 4kHz, because the 16-bit frequency values don't offer enough dynamic range.
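
For anyone wanting to check this, the SID's well-known formula is Fout = Fn × Fclk / 2^24, with Fn a 16-bit register value. Assuming the SID implementation keeps the traditional PAL C64 clock of 985248Hz (an assumption on my part), the ceiling works out like this:

```python
# The SID's frequency ceiling, from the well-known formula
# Fout = Fn * Fclk / 2^24, with Fn a 16-bit register value.
SID_CLOCK_HZ = 985_248   # assumed: the traditional PAL C64 clock

def sid_freq_hz(fn: int) -> float:
    return fn * SID_CLOCK_HZ / (1 << 24)

max_hz = sid_freq_hz(0xFFFF)
print(max_hz)  # ≈ 3848Hz -- hence the "about 4kHz" ceiling
```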

Back to the drawing board then. The 16-bit digital sample registers on the MEGA65 will be much more effective, and allow us to generate something close to a sine-wave, as previously discussed. The only problem is feeding them fast enough. It might be possible to make a really tight CPU loop to do this, but it will be a bit of a pain for playing the tone while also processing the incoming signal. 

What I am thinking for a solution to this is to add Amiga-style intelligent audio DMA logic, so that I can point the machine to a sample table, and have it play the sine curve over and over at arbitrary sample frequencies.  It will also be handy for the MEGA65 retro-computer in any case.  For this, we will need the following information for each channel:

1. Base address of the sample data (28-bit address)
2. Length of the sample (16-bit number of samples)
3. Sample frequency
4. Volume level (maybe)
5. Flag to select 4, 8 or 16-bit samples (maybe)

Those last two are maybes, because they aren't strictly required, but will give more flexible output and make more effective use of memory, respectively.
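
As a sketch of the per-channel state this implies (all names here are hypothetical -- the eventual VHDL registers may well differ):

```python
# Hypothetical per-channel configuration for the audio DMA engine.
from dataclasses import dataclass

@dataclass
class AudioDMAChannel:
    base_addr: int            # 28-bit base address of the sample data
    length: int               # number of samples (16-bit)
    sample_rate_hz: int       # playback frequency
    volume: int = 255         # optional per-channel volume
    bits_per_sample: int = 8  # optional: 4, 8 or 16-bit samples

# e.g. a 32-sample sine loop played fast enough for ultrasound work:
ultrasound = AudioDMAChannel(base_addr=0x40000, length=32,
                             sample_rate_hz=800_000)
```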

The next question is where to locate them in the system. Unlike the Amiga, the MEGA65 always has the CPU as the bus master.  This means that it will end up in the CPU one way or another.  There is a DMA controller in the CPU already, and that could be hacked to provide the means of setting things up.

It also actually reminds me of another big problem for the MEGA65 at least:  DMA jobs on the MEGA65 are not interruptable. This means that if I implement this sample playback method, the sound will pause whenever a DMA job is running.  With a bit of clever pre-fetch and buffering, I can probably hide this problem for all but the longest running of DMA jobs.  Given a CPU frequency of ~40MHz, and the maximum 64KB DMA copy requiring ~130Kcycles (or ~260Kcycles for swaps, when they are implemented), that corresponds to about 7 milliseconds.  If the programmer were disciplined, and broke the DMA jobs down into 1KB pieces, then we can get that figure down by close to two orders of magnitude, to ~0.1 milliseconds, which corresponds to a sample frequency >8KHz.  That sounds like a reasonable situation for now, and I can always make the DMA jobs interruptable by the audio data fetch sometime down the track.
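
The pause-length arithmetic in that paragraph can be sketched like so, using the worst-case ~260K cycle swap figure:

```python
# How long a foreground DMA job starves the audio DMA.
CPU_HZ = 40_500_000
WORST_CASE_CYCLES = 260_000   # a full 64KB swap

pause_s = WORST_CASE_CYCLES / CPU_HZ                 # ≈ 6.4ms, i.e. "about 7ms"
chunked_pause_s = (WORST_CASE_CYCLES / 64) / CPU_HZ  # 1KB pieces: ≈ 0.1ms
gapless_sample_rate_hz = 1 / chunked_pause_s         # ≈ 10kHz, with a 1-sample buffer
print(pause_s, chunked_pause_s, gapless_sample_rate_hz)
```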

Anyway, all of the above means that a buffer of even just one sample per channel should allow decent audio, even with DMAs running.  If I make the buffer even just 4 or 8 samples deep, then much higher frequency audio should be possible -- allowing us to produce ultra-sound while running any little DMA jobs that we might need in the test programmes I will need to write.

Since I've just talked myself into using the DMA controller, I might just make the control for this via unused extended DMA job options, rather than having more memory mapped registers.  It also means that setting up a particular sample will be easier in practice, because you will just be able to trigger the relevant audio DMA job, rather than having to stuff a pile of registers.  I'm liking this plan even more... except it will be a pain for freezing.  So I'll have to find some spare memory-mapped register space, after all. Oh well, it sounded like a great idea.

Time to get cracking: First I need to setup the data structures, and create the memory mapped registers to access them. Those are $D720-$D75F, with $10 registers per channel.  To that I have added all the behind-the-scenes plumbing that does the sample fetching, volume control and mixing.  The whole setup is not over complicated, but simulation is still the best way to ensure correct behaviour. And that's where the problem has arisen...

When I simulate it, GHDL is not incrementing various counters in this new stuff correctly.  This seems to be because it thinks that lines are being cross-driven.  The trouble is, none of the inputs to the calculations seem to have any undefined or other invalid values. This led to a whole rabbit-hole of investigation that took several full days to explore: trying to get backtraces working in GHDL, so that I can see where the meta-values are being produced, as well as dealing with GHDL crashing while compiling the code for synthesis, among other things.

After various adventures, I got GHDL built using the GCC back-end with at least rudimentary back-trace support.  So now I am working through the process of finding and eliminating the meta-value problems, so that I can hopefully get to the one that is causing my simulation problems.  Simultaneously, I have been making small incremental changes to the VHDL and synthesising, to progressively inch towards making the audio DMA stuff actually work.  The whole idea of simulation was to avoid this slow process, but here we are with both running neck and neck, as to which will yield results first.

One thing I realised I needed to add, is a mechanism to prevent the audio DMA from hogging the bus. To do this, I have implemented a hold-off timer, that prevents any two audio DMA cycles occurring with less than 8 cycles between them. Combined with the fact that the DMA cycles cannot interrupt a CPU instruction, this should allow for reasonable processor performance, even when the DMA rate is set reasonably high. Also, it means that we should be able to get around 2 million audio DMA operations per second.  Since we need only one audio DMA channel for the ultrasound, this should be plenty.

Well, that all took longer than intended, and I'm not yet sure that it is all 100% correct. There are some niggling CPU timing problems, that mean that the audio DMA only works reliably when the CPU is set to 40MHz, and not in the Hypervisor. More precisely, it probably will get upset if the CPU is running in anything that is not the main memory.  Anyway, for now, those are reasonable limitations that I can work within.

The audio DMA system works by adding a 24-bit fixed-point fractional increment to the sample counter. When it reaches 1, it's time for a new sample. This means that the sample rate will be CPU CLOCK SPEED * FRACTION.  If for simplicity we call the fraction a simple 24-bit integer, and substitute the CPU speed of 40.5MHz in, we get:

SAMPLE RATE = 40,500,000 * (SPEED / 2^24)

And thus by rearranging, we can find the SPEED value required for any particular sample rate:

SPEED / (2^24) = SAMPLE RATE / 40,500,000

SPEED = SAMPLE RATE * (2^24) / 40,500,000

SPEED = SAMPLE RATE * 0.414252

This means that we can achieve sample rates all the way up to the CPU clock speed, and down to about 3Hz.  In practice, it is limited to around 1 - 2MHz due to the bus saturation limit, and also because the audio cross-bar mixer effectively places an upper-limit on the sample rate that will come out the speaker.  If the audio cross-bar is limiting the frequency too much, we can make a bypass for that, so that we can increase our upper frequency limit.
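
The SPEED conversion above, as a pair of helper functions:

```python
# Converting between sample rates and the 24-bit SPEED increment.
CPU_HZ = 40_500_000

def speed_for(sample_rate_hz: float) -> int:
    return round(sample_rate_hz * (1 << 24) / CPU_HZ)

def sample_rate_for(speed: int) -> float:
    return CPU_HZ * speed / (1 << 24)

min_rate_hz = sample_rate_for(1)  # ≈ 2.4Hz: the "about 3Hz" floor
print(speed_for(44100), min_rate_hz)
```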

For a little test, I have made a 16-entry 8-bit Sine table, so that I can produce a pure tone for calibrating frequency and testing the volume at varying ultrasonic frequencies.  It sounds generally ok, but even I can hear that it isn't a pure tone. So I might increase the sample count and go to 16-bit samples, so that it sounds better.

Well, that didn't work.  I even tried it on the MEGAphone prototype, in case it was the audio output circuitry. However, fiddling around, I did discover something important: The distortion changes based on the code the CPU is running.  I even made a little loop that confirmed that the opcode of an instruction is ending up in the audio stream, by changing the bytes I am playing in the sample loop, and finding that the silent point occurred when the bytes all matched the opcode of the loop.  More investigation reveals that it isn't just opcode bytes that can show up, but seemingly any CPU memory access.  Also, I was seeing that just occasionally the CPU would pick up a byte from the audio DMA data, and jump off into lala land as a result.

This is rather annoying and a bit worrying, as it means that there is a bigger problem with bus timing than I had expected. I already knew that I had to be careful with not allowing the audio DMAs in hypervisor mode, and only at 40MHz because of funny business. And it now seems that these problems are much more significant than I had hoped.  But because they don't show up in simulation, they are a bit of a pain to track down. 

What I might do here is a bit of a pragmatic solution, and make the bus wait an extra cycle when starting an audio DMA so that the value we want REALLY shows up, and then allow the bus to settle for a cycle back on whatever the CPU was asking for before the DMA.  Another slightly more elegant solution would be to use the dead read wait states in the CPU. But that is rather more complex to implement.

Taking a look at the synthesis logs, it looks like the audio DMA stuff has pushed the tolerance for timing closure on the memory controller in the CPU out the window -- which would explain just about everything.  The trick is how to simplify things back down, so that the logic becomes shallow enough again to get closure. 

It looks like it might be easier to use the read wait-state cycles after all.  Those were all reading from address $000002, so they were easy to find.  I've now gone through and refactored all the audio DMA scheduling into a more generic "background DMA" framework, where the sample LSB and MSB reads for the four channels are considered 8 separate DMA targets. It also means we can add other interesting background DMA actions in the future.

For not the first time in this adventure, I have had something that ALMOST works, but not quite. Sometimes it would simulate, but not synthesise due to multiple drivers, or there would be funny corner-cases where it wouldn't correctly realise when the shadow RAM was reading the correct location for the background DMA activity.  The whole interplay of reading the shadow RAM with effectively zero wait-states is just a pain, but we have to work with it for now. 

I think I now have it so that whenever the shadow RAM is not being used, that the CPU does a background DMA read, and correctly latches the data into the audio registers.  While I was doing all that, I also overhauled the audio mixer to use signed samples the whole way through, so that mixing the audio can be done more easily, and without introducing DC biases like unsigned samples do.
It now runs under simulation, with the background DMA reads happening when the CPU reads from IO, or has a wait-state: Basically it makes the background DMA activity the default, and it is only if a different address is presented to the shadow RAM bus that it does something else.  So the moment of proof comes now, while I wait for it to synthesise again...

And again. And a few times after that. But I now finally have a nicely working DMA audio engine.  In fact, it's now good enough that it can play Amiga MOD files with a crude little tracker that I wrote to test it.  Which actually works quite nicely. It can already play a reasonable variety of MOD files, but doesn't yet support most of the MOD effects -- only tempo and instrument volume.

But it does already support repeating samples, since I need that for the ultrasound testing: it would be a pain to have to keep feeding sample data in. Instead, I can just play a 32-byte sine wave loop indefinitely.  The result is this:


From there, I have been working on a little test programme that just plays a continuous sine wave tone, with variable frequency.  The Audio DMA can, in theory, play a sample every tick on the 40.5MHz CPU clock. However, in practice it is limited to about 1/16th of that, i.e., about 2.53MHz. Still not bad.  Our sine wave sample is 32 bytes long, so that means we have a maximum of around 80KHz.  This is comfortably well above our target range of 20 - 30KHz -- and remember that this is not the frequency at which it can play a horrible square wave, it is the frequency at which it can play a pretty nice sine wave -- and all this without the CPU needing to do anything once we set it going.  So I'm pretty happy that we have the audio sub-system in place that will let us produce ultrasonic frequencies. 
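As a quick sanity check of that arithmetic:

```python
cpu_clock_hz = 40_500_000            # 40.5 MHz CPU clock
max_sample_rate = cpu_clock_hz / 16  # one sample per 16 ticks: ~2.53 MHz
wave_bytes = 32                      # length of the looping sine sample
max_tone_hz = max_sample_rate / wave_bytes
print(max_tone_hz)                   # 79101.5625 -- the "around 80KHz" figure
```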

Now, back to that test programme: It plays the 32-byte sine wave in an infinite loop, and lets us vary the frequency and volume.  It also shows a nice little oscilloscope display of what it thinks it is playing:


Here we can see it set to ~40KHz, above what we need, and the 40MHz 8-bit CPU is doing a fine job of displaying this in real-time.  The sweep time is currently fixed, with the 256 pixel wide display corresponding to about 130 microseconds, as a natural consequence of the tight sample reading loop. It took a little bit of fiddling to get the programme right, though:

First, I had to synchronise the sweep to always start at the beginning of a loop of the sample. This helps to hold the display with a fairly steady phase. 

Second, I was using DMA to update the display with a double-buffered arrangement I had made for another programme. However, that can't be used here, because foreground DMAs, such as the double-buffer copying, cause the audio DMA in the background to pause. This was causing quite horrible audio artefacts, as the tone would be interrupted tens of times per second for several milliseconds at a time.  Eliminating the double-buffer and all other foreground DMAs fixed that problem.  This is one of the things about this audio sub-system that leaves it clearly in the 8-bit home computer class, and not in the multimedia PC class where the Amiga lives: There are no prioritised media DMA slots to ensure stable audio at all times.  Demo writers thus get to have fun working out how they can do parts with cool DMA effects *and* still have nice digital audio playing in the background.

Third, even when I fixed those problems, there is still a funny artefact where there are spikes in the audio playback. There are some hints that this might be a glitch when the next sample begins to play. However, the audio hardware is so fast -- driving the PDM at 40.5MHz, i.e., at ~25ns intervals -- that it is hard to capture this reliably on my oscilloscope. This causes wave-forms that look like this:



This problem is manageable, however, as the sine wave is still otherwise intact, and produces an acoustically decent tone.   The glitches do produce some audible artefacts when driving ultrasonic frequencies, but this is all at tolerable levels for the test.  So I'll worry about fixing that later, unless it does turn out to be problematic.

The next step is to add the reading of the microphone data.  Hopefully this will go smoothly, as I have already established that the microphone is sensitive well past 20KHz, and according to the data-sheet, all the way up to the maximum 79KHz that we would realistically generate.

Getting the code running on the MEGAphone prototype was easy, as it is fully compatible with the desktop MEGA65 that I have been developing it on.  The only trick was I had to remember how to control the amplifier on the MEGAphone for the speakers, so that I can get the volume loud enough for testing.  This is controlled by $FFD7035 and $FFD7036, where $00 = +24dB, $40 = 0dB, $60 = -24dB and $FF = mute.  It was set to $60 by default and was way too quiet. 

I did some initial testing with $00 (+24dB) and it seems to work, although I think it might send the amplifier into over-current shutdown after a while. That would be ok for this application, as we only need a very low duty-cycle.  What is more of a problem is that at +24dB there are a lot of audible artefacts which I will need to investigate. Once the kids have gone to sleep I'll start experimenting again to see if +0dB is loud enough to be detected at a decent range.

So, in terms of initial tests, I'm testing only at very close range, and having to deal with some distortion caused by a bug in the audio DMA subsystem that I have yet to figure out: Basically if the CPU is running, then the waveform is distorted.  I'll likely deal with that by implementing a special "play sine wave" mode for the audio subsystem, so that we can side-step the whole DMA thing altogether. But it's still good enough that we can work out where the resonances are that are strong enough to receive anything.  So:

There is a peak between about 20.0KHz and 20.2KHz, that ramps up more slowly on the low side, and falls off quite quickly on the high-side.  Then there are also a few peaks in the 25KHz to 30KHz range, but they are lower than the 20.1KHz centred peak. Above 30KHz I haven't seen any signs of life as yet. But this DMA distortion problem gets worse the higher the frequency, so I think I need to implement the "play pure sine wave" function, and then test it again.  But I must say: I'm pleasantly surprised that the speaker is in fact producing any energy at all above 20KHz.  Whether it has enough range, and whether it can do so with quiet enough audible artefacts remains an open question, though.

I implemented the sine table ROM to avoid the distortion.  There are still audible artefacts, presumably because the MEGA65's audio output is PDM, i.e., lots of 1s and 0s rather than a true sine wave.  A side-effect of fixing this is that the peak response shifted down from ~20.1KHz to around 18KHz.  There is some response up around 26KHz-30KHz, but it is quite weak. 
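The audible artefacts make more sense when you recall how PDM works. Here is a toy first-order sigma-delta sketch in Python, purely illustrative -- the MEGA65's actual modulator may differ -- showing how a 1-bit stream's density of 1s tracks the sample amplitude, and why the raw output is full of high-frequency energy that wants filtering:

```python
def pdm(samples, levels=256):
    """First-order sigma-delta: turn sample values (0..levels-1) into a
    1-bit stream whose density of 1s tracks the sample amplitude."""
    bits, error = [], 0
    for s in samples:
        error += s
        if error >= levels // 2:
            bits.append(1)
            error -= levels
        else:
            bits.append(0)
    return bits

# Mid-scale input -> alternating bits (50% density of 1s):
print(pdm([128] * 8))   # [1, 0, 1, 0, 1, 0, 1, 0]
# 75%-scale input -> 75% density of 1s:
print(sum(pdm([192] * 8)))  # 6
```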

It looks like the next step required will be to somehow filter the signal, so that we get closer to a true sine curve.  The data-sheet for the amplifier suggests using ferrite beads on both leads to the speaker, and a 470uF capacitor to GND on the high-side.  At that point, though, we cease to be working with the existing MEGA65 hardware without modification, and we are already well past the scope of this work unit.

What I will still do, is attempt to more adequately quantify the range of ultrasonic frequencies that respond, and what the fall-off looks like, so that we can identify the likely usable frequency range for communications.  Minor speed-bump, though, in the form of the speaker lead breaking:


Fixed that. The next step was to improve the ultrasound test programme to show some kind of frequency domain break-down.  An FFT would be ideal, but a fair bit of work to produce, when I just want to be able to see if there is noticeable energy at a frequency. So I made a simple programme that superimposes the sample train over itself at all possible time deltas, and then measures the energy of the result.  If the delay corresponds to a periodicity of the signal, then it will result in a higher amplitude signal.  I added a bit of filtering to deal with harmonics etc, and give or take a few wrinkles, it produces fairly clean peaks for the frequencies at which energy is present:
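This superimpose-and-measure approach is essentially an autocorrelation. A Python sketch of the idea (illustrative only, not the actual MEGA65 code):

```python
import math

def energy_at_delay(samples, delay):
    """Superimpose the sample train on a delayed copy of itself and
    measure the mean energy of the sum: a delay matching the signal's
    period reinforces, while other delays partially cancel."""
    n = len(samples) - delay
    return sum((samples[i] + samples[i + delay]) ** 2 for i in range(n)) / n

# A pure tone with a period of 16 samples:
tone = [math.sin(2 * math.pi * i / 16) for i in range(256)]

# A full-period delay reinforces; a half-period delay cancels almost entirely.
assert energy_at_delay(tone, 16) > 100 * energy_at_delay(tone, 8)
```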


The red peaks correspond to the periods at which energy is present, i.e., further left is higher frequency, and further right is lower frequency.   The very tall line-like peak on the left edge is an artefact of measuring energy by superimposing waves and looking for reinforcement, as it will basically pick up whenever adjacent samples don't zero-cross.  Thus it should be ignored. But we can see that the next peak corresponds to 1 cycle of the ~76KHz resonance of the microphone. The other peaks come and go a fair bit, and are less robust, and may correspond to some weaker lower frequency enveloping of the 76KHz resonance.

So I think we are finally set to try various frequencies and measure at which frequencies we reliably see energy, and to delineate that set of frequencies as those which are potentially usable for ultrasound communications on the existing MEGAphone hardware, and thus to give us sufficient characterisation of the available channel bandwidth.

I'll work with the amplifier set to $40, which is some way off the loudest setting, and the speaker leaning against the MEGAphone prototype about 1cm away from the phone.  If we can't pick up a signal under those conditions, then we'll assume it won't be usable. But if we can, then we can repeat the test with some increased distance and power, and see what kind of range is possible.  To support this, I have improved my test programmes to work out the period of the sample frequency requested, and to calculate a running average of the estimated power at that frequency.  In this way, I have an objective and quantitative means of comparing the power at different frequencies, even if the units are undefined because of the estimation process.  The display with this looks like this:
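The period and running-average bookkeeping can be sketched like this. Note the effective sample rate here is my assumption of 40.5MHz/16; the real test programme's display loop runs at its own rate:

```python
# Assumed effective sample rate (illustrative): 40.5 MHz CPU clock / 16.
SAMPLE_RATE_HZ = 40_500_000 // 16   # 2_531_250

def period_in_samples(freq_hz):
    """Samples per cycle of the requested tone frequency."""
    return round(SAMPLE_RATE_HZ / freq_hz)

def running_average(estimates):
    """Incremental running mean over successive power estimates; the
    units are arbitrary because of the estimation process."""
    avg = 0.0
    for n, p in enumerate(estimates, start=1):
        avg += (p - avg) / n
    return avg
```

The incremental form avoids keeping all the samples around, which matters when you only have 8-bit registers and a few KB to play with.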


The light blue vertical line is the period we are looking for. This corresponds to exactly one full wave of the tone being played (visualised by the white points). The received return via the microphone is the yellow points, and the red line is the power at each period.

So we can see visually that there is some power at the target frequency here, as there is an observable periodicity in the yellow waveform, which is then reflected by the red peak at that point. The running average is then calculated and displayed in the lighter red coloured text. For each sampling, I will collect 63 samples.  I'll also do a quick variance check by doing three successive collections with a single setting, so that we can get an estimate of how reliable the readings are. It was quite pleasing to be able to implement all of this directly on the MEGAphone's 40MHz 8-bit CPU.

Okay, so having batched a bunch of runs at different frequencies, we find the following:

1. Below ~19.4KHz the waveform is very loud.
2. Between ~19.4KHz and 19.7KHz it's not quite as loud, but still very loud, perhaps 1/2 as loud as just below 19.4KHz.
3. From 19.8KHz to 20.4KHz it's about 1/2 as loud again, with relatively flat response.
4. From 20.4KHz to about 21.4KHz it is about as loud as between 19.4KHz and 19.7KHz again.
5. From 21.4KHz it drops off very sharply, with no discernible signal beyond 21.7KHz.

So assuming we want to keep a high enough frequency to not annoy people, there is probably about 1.4KHz of bandwidth centred on 20.7KHz that would be usable.  As tempting as it would be from a link-margin perspective, dropping down to below 19.4KHz is probably not feasible.

Increasing the amplifier level from 0dB to +12dB yielded a discernible waveform at distances of around 30 - 40 cm at 19.4KHz -- i.e., where the performance of the system is very good.  The amplifier can go to +24dB, which would thus be expected to perhaps deliver the ~1.5m range, but at a power consumption during transmission of greater than 2W.  It seems rather unlikely that 1.5m range would be obtainable in practice over the whole band, although somewhat shorter range may well be possible, or it may be possible to increase the amplifier power further.
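As a rough free-space sketch of that extrapolation (ignoring air absorption, which is actually significant at ultrasonic frequencies, hence the scepticism about achieving it in practice):

```python
def projected_range(base_range_m, extra_gain_db):
    """Free-space estimate: each +6 dB of drive doubles the usable
    range (inverse-square law). Real ultrasound fares worse, because
    air absorption rises steeply with frequency."""
    return base_range_m * 2 ** (extra_gain_db / 6)

# ~0.35 m was observed at +12 dB; extrapolating to +24 dB:
print(round(projected_range(0.35, 12), 1))  # 1.4 (metres)
```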

But whatever the frequency and amplifier arrangement, some kind of filtering is going to be required, to prevent the audible artefacts that are present across the whole frequency range tested.

That said, we have clearly proven that the microphones we are using are sensitive to ultrasound. Thus it could be possible to use a smaller speaker with improved ultrasonic properties to further boost performance.

But as I reflect on all this, it seems to me that there are a bunch of problems, that together make me a little sceptical about the utility of such a system in practice:

1. Transmit power of 2W or more is likely required to produce a useful range over a wide-enough ultrasonic band.  Even at a low duty-cycle, this will create a significant power consumption.
2. The speakers we use, and most ultrasonic speakers, are rather directional, making detection of proximity rather unreliable. It was quite fiddly to get the results described above, demonstrating that this is not just a theoretical problem.
3. Point (2) is made worse if you have the device in a bag or pocket.
4. The audible artefacts are REALLY annoying.
5. Some people can likely hear up around 20KHz, and even the occasional short chirp of a packet is going to be very annoying to listen to.
6. Bluetooth is already widely available, requires no new hardware, and for all its problems, has much better performance than I have been able to observe here. In particular, the lack of risk of audible artefacts seems compelling in the circumstances.
7. The need for contact tracing apps seems to have somewhat dissipated, at least for the time being.

However, as a further communications channel between devices when faced with a hostile RF environment, it would seem to have potential.  In that case, it would likely be possible to mask the audible artefacts by, for example, playing music or other sound while the ultrasonic transfer occurs.  In this context the difficulty of creating an ultrasonic signal that can travel great distances becomes a strength, because it means that an effective jamming effort would require considerable proximity. In contrast, 2.4GHz Bluetooth or Wifi are rather easy to jam from a distance. 

Thus, while this wasn't the primary objective of this investigation, it has revealed that such near-ultrasonic communications is quite possible using rather conventional components found in smart-phones and similar devices.  It also means that from a privacy perspective we must take care, as it is similarly possible for devices to communicate via ultrasound without a user's knowledge, e.g., to exfiltrate data across air-gaps.

Wednesday 17 June 2020

Pre-Ordering for the MEGA65 Developer's Kits (DevKits) is nigh!

Goodness me, it's been a long time that we have been working on the MEGA65 now. But we are finally at the super-exciting stage where things are starting to happen more quickly.  And that includes this week, when pre-ordering of the DevKits will open over at Trenz Electronic.

So let's talk about what the DevKits are, and what the price will be. 

But before I continue, I just want to be careful to explain that the pricing of the DevKits is not representative of what the price of the final machine will be.  There are a number of reasons for this: 

First up, at this early stage of things, we don't have the price of the various components optimised.  We are buying smaller quantities, and learning how to do everything right. That all translates to increased costs for the DevKits. We hope to be able to sell the final machines for less than it costs us to manufacture the parts for a DevKit, for example.

Second, the DevKit release is partly to help early adopters get hold of the machine and start writing software, documentation and other goodies for the community, and partly to provide us the cash-flow to get the production machines out ready as soon as possible.  This means covering the cost of designing and printing the packaging, user guide and other goodies.  While we are already well advanced on some of that, there are some very real costs that we need to cover.

Third, somewhat counter-intuitively, the acrylic cases of the DevKits probably cost more than the injection moulded cases of the production machines will.  This all comes back to the wonder of injection moulding: You pay a fortune up front, and then you can produce the best cases at the best price for a long time after.  That's where we are aiming to be for the production machines.

All up, expect that the DevKits will be quite noticeably more expensive than the production machines.  Also, remember that the MEGA65 is an open-source computer being produced on a non-profit basis: None of the MEGA65 team earn anything from the sales. It all goes to cover costs and support the completion of the machine.

But let's get back to talking about the DevKits: One of our big goals is to increase the number of people who can develop on, and contribute to the MEGA65.  We see this as a once-in-a-generation opportunity to help shape and be part of the story of The Last 8-bit Computer that never was, and now will be.  There may be other 8-bit systems in the future, but as the spiritual successor and completion of the C65, we think the MEGA65 has a pretty special role to play -- and we'd love to have more folks help us make it as awesome and exciting as we can. 

We want it to be a machine with a variety of software and good compatibility and rock-solid performance when it is ready to arrive under people's Christmas trees, birthday present piles and Retro Rooms, so that we can recreate that "Christmas 1982" feeling one more time for the community. To achieve this, we need as many people as possible to contribute in a variety of ways, whether helping with documentation, C65-fixing existing software, writing new games, programmes and tools, or contributing to the VHDL and operating system software of the MEGA65 itself. 

It's a radically different model to most computers around today, which are more like mountains upon which we gaze, or perhaps at most seek to climb.  But we want the New Zealand model, where geography isn't just something you look at, but is rather more of a participatory sport.  So too with the MEGA65: it's by participating in the story that you have the most fun, and can share the most joy with the community. Come. Be part of the story with us.

Frequently Asked Questions





Q: What's a DevKit?
A: A development kit aimed at developers so they can start coding software for the machine and even help shaping the final product before it is released.

Q: Why does it look so different?
A: The case is made in a way that it can be produced in small batches before the injection moulds are finished. Its transparency helps with finding out whether the smoke stays inside the chips.

Q: I am a collector not a developer.
A: The DevKits have laser-engraved Logos and serial numbers to make them unique. DevKits usually are great collector's items.

Q: I only want to play with it!
A: The DevKit is like a "real" MEGA65 only in a preliminary form. You might encounter hiccups but you can always (soft-)update it.

Q: I do not like the floppy.
A: The DevKits come as (hence the name) kits which include a refurbished floppy drive. Feel free to leave that out and donate it to us.

Q: What will it cost?
A: DevKits are always more expensive than mass-produced machines, but they also get strong support from the makers. The MEGA65 DevKit comes with a price tag of EUR 999, which is in fact very low considering the cost of the components, support and general preparations required for this initial production run of machines, as well as our costs of getting the final machines ready for release. The final machines will benefit from these things, and from improved economies of scale, which will allow them to be released at a lower price.

Q: Can you build it for me?
A: It's really easy to build, usually under an hour, maybe a bit more if you are clumsy. If you do not dare to build it yourself please ask in our support forum or via the other communication channels you will get access to. There are many nice people around!

Q: I am more interested in the final MEGA65 and not really a developer, how can I support and improve the development of the final machine?
A: Please buy a DevKit and lend/donate it to a talented developer!

Q: When can I buy it?
A: From tomorrow on, but do not wait too long!

Q: I am a blessed developer and want to sacrifice all my time but I do not have any money!
A: Please talk to us about support!

Q: If I buy a DevKit, can I transfer the PCB and keyboard into a MEGA65 case later? Can I even 3D-print my own MEGA65 case?
A: Most probably yes! But we can’t guarantee it.

Friday 12 June 2020

Fit testing the injection moulded case sample

Another post where I promise to be short on the words, so that you can just enjoy seeing the MEGA65 hardware become a reality:  This time we have the components all being put together to make sure that everything fits.

These photos are hot off the press -- as you can read in the labels on the case parts, they were only manufactured on the 8th, and it's now early on the 12th.  I'll wait to hear from the team if any problems have been spotted, but so far it looks fantastic to me.  The keyboard has enough clearance around all the keys so that none should jam (a problem on the real C65 prototypes, where the cursor keys and return were rather problematic).

But as promised, I'll now shut up, and let you enjoy the eye candy.











Saturday 6 June 2020

Fixing some floppy bugs

Among everything else, we have been looking at some bugs with the MEGA65's internal floppy controller.  It was working most of the time, but would hang in various situations.

The first problem was that it would hang when loading files a long way from the directory track.  I was worried at first that it was some problem with the MFM decoding not being good enough.  So I wrote a nice test programme that reads some MFM decode debug info, and shows a histogram of the gap sizes.  This should result in 3 very clear peaks corresponding to the different bit gap lengths that MFM produces. As the test disk I have here is empty, it's quite heavily skewed, but it is still clear that the peaks are there, are well spaced, and nice and narrow:

The third peak here is really just a little blip, because of the disk being empty. But watching multiple frames, I could see that it is there and real. The colours really just indicate the height of the lines.  The left edge of the chart is shorter intervals, and the right side longer intervals.

I'm actually really happy with this nice little tool, as it runs continuously, and you can swap disks etc, and see the content change.  With a formatted disk, it does several frames per second.  This is of course running natively on the MEGA65.
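For the curious, the classification that the histogram makes visible can be sketched like this. The 2.0us base gap is my assumption for HD media (DD media would be around 4.0us), and the real decoder of course works on raw flux intervals rather than microsecond floats:

```python
def classify_gap(interval_us, short_gap_us=2.0):
    """Sort a flux-transition interval into one of MFM's three nominal
    gap lengths: 1x, 1.5x and 2x the shortest gap."""
    ratio = interval_us / short_gap_us
    if ratio < 1.25:
        return 1
    elif ratio < 1.75:
        return 1.5
    return 2

def histogram(intervals_us, bins=64, max_us=6.0):
    """Bucket intervals like the on-screen tool: shorter intervals on
    the left, longer intervals on the right."""
    counts = [0] * bins
    for t in intervals_us:
        b = min(bins - 1, int(t / max_us * bins))
        counts[b] += 1
    return counts
```

On a clean disk the three `classify_gap` bands line up with the three narrow peaks in the histogram; on an unformatted track the intervals smear continuously across the bands, which is exactly why no classification is reliable there.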

The video mode is 640x200 using a combination of normal text and 16-colour text mode, where each nybl of a character byte encodes one pixel.  This means the whole screen fits in 640x200x0.5 bytes = ~64KB.  Being able to mix normal chars in makes it much easier (and faster) to draw text over the display.  This all contributes to the quite fast performance, even though I wrote it all in CC65, which, while quite handy, doesn't really produce particularly fast compiled C code.  One day we will teach it some of the 4510 and 45GS02's tricks to produce MUCH faster output, but that will have to wait for another day.

Meanwhile, if you are curious what the distribution of an unformatted track looks like, here is an example:





We still see indications of the first and second peaks, perhaps because of some factory formatting artefact or something, or from whatever else was on these disks previously.  But we see the distribution is continuous, and thus it isn't really possible to classify any given sample with certainty. The drop-off on the left edge is presumably due to the limit of the magnetic medium.

I find the whole low-level signal-processing side of floppies quite fascinating.  One day when I have time, I want to see just how much data I can cram on a 720K or 1.44MB floppy using modern RLL(2,7) encoding, a single really long sector, variable write speed per track to match the varying linear velocity of the tracks, and using modern error correcting codes to enable us to tolerate some errors.  My gut feeling is that at least double the capacity should be possible.  But that, also, will have to wait for another day.
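A back-of-envelope version of that gut feeling. The RLL(2,7) figure is the standard density advantage over MFM at the same minimum flux-transition spacing; the other two gains are my own rough guesses:

```python
mfm_capacity_kb = 1440     # a standard 1.44MB HD format, MFM encoded
rll_gain = 1.5             # RLL(2,7) packs 1.5x the data at the same
                           # minimum flux-transition spacing as MFM
zone_gain = 1.2            # rough guess: matching write speed to each
                           # track's linear velocity (outer tracks are longer)
sector_gain = 1.1          # rough guess: one long sector per track means
                           # fewer headers and inter-sector gaps
estimate_kb = mfm_capacity_kb * rll_gain * zone_gain * sector_gain
print(round(estimate_kb))  # 2851 -- roughly double, matching the gut feeling
```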

Anyway, having confirmed that the floppy was being read reliably, I started implementing a random track seek function, so that I could see if it was the seeking that was the problem.  And indeed it was: Sometimes the drive would seek either one track too few or one track too many.

I thought about a few different ways to solve this problem. In the end, I opted to include a feature that makes it easier to use the controller:  If the MFM decoder spots a sector on the track under the head, and it doesn't match the track we are expecting, the controller will step the head one track in the correct direction. It's a bit like an auto-tuner for a celebrity who can't reliably stay on the correct notes, but for floppy drives.

This is nice from a programmer's perspective, because you don't have to step the drive to the right track before scheduling a read or a write. It can still be turned off, if you don't want it, but for most use-cases, it's probably a good idea.
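The auto-tune rule itself is simple; a sketch of the idea in Python rather than the actual VHDL:

```python
def auto_tune_step(expected_track, header_track):
    """If a decoded sector header shows the head is on the wrong track,
    step one track toward the target (the 'auto-tuner' behaviour);
    return 0 when we are already where we should be."""
    if header_track == expected_track:
        return 0
    return 1 if expected_track > header_track else -1

assert auto_tune_step(40, 39) == 1    # one track too low: step inward
assert auto_tune_step(40, 41) == -1   # one track too high: step outward
assert auto_tune_step(40, 40) == 0    # on target: stay put
```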

With auto-tune implemented, the tracking was now quite reliable.  That fixed the problem of loading files that were a long way from the directory track.  However, loading big files would sometimes hang, and Falk, who is working on the MEGA65 GEOS port, was also having drive lock-up problems.  So I enhanced the floppy test utility to include a looping read test.  This reproduced the problem, with the test locking up after random amounts of time.  It would also hang completely if the drive was on an unformatted track, or no disk was inserted.

So I went through the read timeout logic with a fine-tooth comb, and found some corner cases and fixed them.  That got it working nicely. Here is the read test working:


The two-tone green is just so that you can more easily work out which track is involved. Track 0 is on the left, and it will try up to track 85, just because I felt like it.

In the process of this, I also discovered that you can't really trust the side byte in the sector header of disks formatted in a 1581, so I modified the controller so that it only checks the track and sector match.


There will probably be a few more wrinkles to sort out in all this, but it's a nice step forward.

Injection Moulding Tooling Update

Just a quick post to give you all an update on the tooling for the MEGA65 case.  As most of you will probably already know, we are having real injection-moulded cases for the MEGA65, thanks to the generosity of the community who collectively donated the cost of producing the moulds -- a total of some 66,000€!

So, we are now at the point where they can do test-runs with the tools.  This will be done to produce a few pieces for fit-testing with the motherboard and keyboard. It will also be used to look for problems in the injection moulding process, e.g., if the plastic doesn't flow to every corner properly, or there are visible artefacts of the plastic flow slowing or doing other strange things. 

There will be more testing, and then finishing touches, like sand-blasting the texture into the mould cavities.  The coloured plastic for the production parts will also be used at the end. In the meantime it is easier for the factory to just use the naturally coloured plastic during testing, so that they don't have to store or throw away plastic they can't use for other things in the meantime.

Anyway, enough blabber from me -- enjoy the photos and short videos showing the tools in action!